Lee Heydrick and Kenneth A. Jones
A familiar term from advertisements for products and services, reliability is the probability that an item will perform as intended, under specific conditions, for a specified time. It can be measured in several ways, the most common being probability of success and mean time between failures (MTBF), which is calculated by dividing the total performance time of a group of items by the total number of malfunctions incurred.
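The MTBF calculation described above can be sketched in a few lines. This is a minimal illustration, not a method taken from the case study: the fleet figures are hypothetical, and converting MTBF into a probability of success assumes a constant failure rate (exponential model), which the article does not state.

```python
import math


def mtbf(total_operating_hours: float, failure_count: int) -> float:
    """MTBF = total performance time of a group of items / total malfunctions."""
    if failure_count <= 0:
        raise ValueError("MTBF is undefined without at least one observed failure")
    return total_operating_hours / failure_count


def probability_of_success(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of completing a mission, ASSUMING a constant failure rate."""
    return math.exp(-mission_hours / mtbf_hours)


# Hypothetical fleet: 20 units run 500 hours each, with 4 malfunctions in total.
fleet_mtbf = mtbf(20 * 500.0, 4)
p_success = probability_of_success(5.0, fleet_mtbf)  # 5-hour mission, illustrative
```

Under these assumed numbers the two measures are linked: a longer MTBF or a shorter mission raises the probability of success.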
In turn, reliability engineering is the discipline of quantifying a product's reliability requirements, ensuring that practices are in place to design in reliability, and verifying that reliability requirements are met through design analysis and testing. It has long been a critical component of the design and production of electronic and electromechanical equipment in the aerospace and defense industries, and it seems logical that its proven techniques also should be routinely applied in the design and development of medical devices, where the consequences of failure may be patient injury or even death.1-3
As the complexity of medical equipment continues to increase, it becomes critical to focus on reliability during product development rather than attempt to test reliability into a product after the design is complete. Designed-in reliability can be accomplished most effectively by integrating reliability engineering activities with other design engineering tasks throughout all phases of product development.4,5 As reasonable as this concept may sound, it has not been uniformly applied by the medical device industry.1
There is a perception that reliability engineering adds significantly to product development costs. In fact, when proven reliability engineering techniques are applied during the design and development phases, the resulting benefits far outweigh the costs. Optimum equipment reliability, and the associated reduction in operating and maintenance costs, can typically be achieved with an increase in development costs of less than 5%. Conversely, the consequences of poor reliability--degraded equipment performance, excessive repair costs, patient safety hazards, and even patient fatalities--are well documented.6,7
The reliability of the various types of equipment used during open-heart surgeries to manage myocardial protection is one area of critical concern. The perfusionist, who is responsible for the setup and operation of the heart-lung machine, must be confident that the system assembled not only can support the preferred protocol and deliver the prescribed cardioplegia solution, but also can respond to changes during the course of surgery as the patient's condition dictates. This article describes the reliability engineering process that was used during the development of a device that provides cardioplegia delivery and real-time management of myocardial protection. The reliability tasks that were implemented are identified; additional detail on how to execute these tasks is available elsewhere.2,8,9
The cardioplegia delivery system provides for multiple combinations of blood, crystalloid, arresting agent, and additive solutions under the programmable control of the perfusionist. These features reduce the probability of human error and contribute to reduced operating costs, an important consideration in today's cost-conscious health-care environment.
THE RELIABILITY PROGRAM
As emphasized above, it is important to apply established reliability engineering techniques concurrently with the product design and development phases. In the case study, the underlying theme of the reliability program was to identify potential problem areas and to incorporate effective preventive measures to preclude the occurrence of hardware or software failures. The primary elements of the program were the establishment of quantitative reliability and safety requirements, design analysis, and verification of reliability through analysis and testing.
Reliability Requirements Definition. It is common practice at the beginning of a product development project to define detailed quantitative performance criteria (based on customer needs) in a specification document. Ideally, these specifications should include quantitative reliability requirements. For the cardioplegia delivery system, such quantitative requirements were established, with consideration of several clinical scenarios, for three reliability specifications: the probability of successfully completing an operation without a system performance failure, the probability of a safety-critical failure, and a minimum operating life. These requirements were then reviewed by a panel of cardiovascular surgeons and perfusionists, and changes were incorporated based on their feedback. Once the overall system requirements had been established, a quantitative requirement was assigned for each subelement of the delivery system. For the purpose of these requirements, a typical surgery is assumed to consist of a 2-hour setup, a 90-minute surgery including 30 minutes of cardioplegia solution delivery, and a 2-hour shutdown.
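As a rough sketch of how a system-level requirement might be divided among subelements, the example below totals the typical-surgery profile given above and apportions a system reliability target across series subelements by weight. The apportionment method, the subelement names, the weights, and the 0.999 target are all hypothetical; the article does not say how its allocation was performed.

```python
import math


def mission_hours(setup_h: float, surgery_h: float, shutdown_h: float) -> float:
    """Total powered-on time for one typical surgery."""
    return setup_h + surgery_h + shutdown_h


def allocate_reliability(system_target: float, weights: dict[str, float]) -> dict[str, float]:
    """Weighted apportionment for subelements in series.

    Each subelement receives system_target ** (its share of the total weight),
    so the product of all allocations equals the system target. A larger weight
    means a larger share of the allowed unreliability.
    """
    total = sum(weights.values())
    return {name: system_target ** (w / total) for name, w in weights.items()}


# Profile from the text: 2-hour setup, 90-minute surgery, 2-hour shutdown.
typical_surgery_h = mission_hours(2.0, 1.5, 2.0)

# Hypothetical subelements and complexity weights.
targets = allocate_reliability(
    0.999, {"pump": 3.0, "controller": 2.0, "heat exchanger": 1.0}
)
```

Multiplying the allocated values back together recovers the system target, which is a convenient check that the subelement requirements are consistent with the overall one.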
Design Analysis. Parts selection was based on the defined reliability and operating life requirements, as well as other performance parameters. To increase the probability that inherent design reliability will be maintained throughout the life of the system, parts with an established reliability history were used whenever possible. Components also were subjected to additional burn-in (environmental stress screening) when necessary to eliminate marginal parts and early-life failures.
High reliability in electronic equipment is generally achieved by limiting the electrical, mechanical, and environmental stresses (voltage, current, temperature, duty cycle, and so forth) that are applied to each part during normal operation. To ensure that the operating stresses for each design application were well below the various components' maximum rated operating values, derating criteria were defined for each part type used in the cardioplegia delivery system. (Derating is limiting the use of a part to conditions that are less severe than the maximum levels specified by the part manufacturer.) Generally, the ratio of operating stress to rated stress was limited to 50%, which represents a 100% design margin. For example, a maximum voltage of 25 V will be applied to a capacitor that is rated for 50 V. Complex parts such as microcircuits have multiple derating criteria, covering such parameters as output load current, frequency, fanout, and power dissipation, which are typically held to 80-90% of maximum rated levels.
Mechanical and electrical stress analyses were performed by measuring or calculating the stresses to which each part will be exposed, determining the ratios of applied stress to rated stress, and verifying that the ratios met the derating criteria. The key parameter that affects reliability was analyzed for each part type--for example, power dissipation for resistors, junction temperature for semiconductors, and applied voltage for capacitors. Parts that did not satisfy the criteria were replaced with similar parts that did, or a design change was incorporated to reduce operating stress.
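The derating check described above amounts to comparing each part's stress ratio with the limit for its part type. The sketch below illustrates this under stated assumptions: the capacitor entry is the 25 V on 50 V example from the text, while the other parts, their reference designators, and their values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PartStress:
    ref: str                 # reference designator (hypothetical)
    parameter: str           # key stress parameter for the part type
    applied: float           # measured or calculated operating stress
    rated: float             # manufacturer's maximum rated value
    max_ratio: float = 0.50  # derating limit; 50% is the general rule in the text

    @property
    def ratio(self) -> float:
        """Applied stress divided by rated stress."""
        return self.applied / self.rated

    def meets_derating(self) -> bool:
        return self.ratio <= self.max_ratio


parts = [
    PartStress("C1", "voltage (V)", 25.0, 50.0),                 # example from the text
    PartStress("R7", "power dissipation (W)", 0.20, 0.25),       # hypothetical, overstressed
    PartStress("U3", "output load current (mA)", 20.0, 24.0, max_ratio=0.90),  # hypothetical
]

# Parts that would need to be replaced or have their operating stress reduced.
violations = [p.ref for p in parts if not p.meets_derating()]
```

In this illustration only the resistor fails the check, so it would be a candidate for a higher-rated part or a design change.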
Because lowering temperatures increases part reliability, a thermal analysis was performed to identify any hot spots, eliminate high temperatures, and improve overall heat dissipation. The internal thermal profile of the delivery system was fully characterized and airflow patterns were evaluated. Improved thermal management techniques, such as additional heat sinks and increased airflow in the power supply, were incorporated to enhance system reliability.
Failure mode, effects, and criticality analysis (FMECA) and fault-tree analysis are commonly used during medical device development to assess potential safety hazards.10 FMECA is the evaluation of potential part failure modes, the effects each failure would have on unit performance, and the criticality of degraded performance relative to safety or system functions; fault-tree analysis is the logical diagramming of single and combined failures that could create a potential hazard. For the cardioplegia delivery system, potential fault conditions that would cause major performance degradation or significant equipment maintenance were identified as part of the FMECA process and their occurrence minimized. In addition, the probability of a safety-critical failure mode occurring was determined for each part, the individual probabilities were combined, and the resultant probability of a safety-critical system failure was compared with the predefined quantitative delivery system requirement.
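Combining individual probabilities into a system-level figure can be illustrated with simple fault-tree gates. The sketch assumes statistically independent events, and every probability, event, and the requirement value is hypothetical; the article does not describe the combination rule it used.

```python
import math
from collections.abc import Iterable


def or_gate(probabilities: Iterable[float]) -> float:
    """Probability that at least one independent event occurs."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def and_gate(probabilities: Iterable[float]) -> float:
    """Probability that all independent events occur together."""
    return math.prod(probabilities)


# Hypothetical combined failure: a sensor fault that is also missed by its monitor.
undetected_sensor_fault = and_gate([1e-4, 1e-3])

# Hypothetical per-surgery probabilities of single safety-critical failure modes,
# plus the combined event above, feeding the top-level hazard.
system_p = or_gate([1e-6, 2e-6, undetected_sensor_fault])

requirement = 1e-5  # hypothetical quantitative safety requirement
meets_requirement = system_p <= requirement
```

The AND gate shows why independent monitoring is effective: two moderately unlikely faults must coincide, so their joint probability is far smaller than either alone.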
A software reliability analysis was also a key element in the reliability program for the cardioplegia delivery system. Until recently, hardware safety and reliability received much more attention than did software. However, the application of reliability techniques to software development is critical if satisfactory system reliability is to be realized in automated devices.11 The core elements of the software reliability program that was implemented during development of the operating software for the cardioplegia delivery system were the following:
- Documented performance requirements.
- Design and coding standards.
- Quality practices and standards.
- Code inspections.
- Requirement tracing.
- Simplified design.
- Reliability modeling to predict operational reliability.
- Extensive testing beginning at the module level.
- Periodic audits and audit trails.
- Documentation and resolution of all defects.
- Safety hazard analysis.
- A clearly defined user interface.
Product development programs usually include some type of design review prior to release of a design for manufacture. For the cardioplegia delivery system this process included a review of the reliability requirements that were defined in the final product specification and of the design's ability to meet those requirements. Adherence to design criteria was discussed and any exceptions to guidelines were resolved prior to approval of the design.
Reliability Verification. The cardioplegia delivery system design was subjected to testing and analysis to verify that both the hardware and software met specified reliability requirements and that the system could be produced without degrading its inherent reliability.
A reliability assessment is the use of historical data on part applications and failures to predict the expected inherent reliability of a system. For the cardioplegia delivery system design, this was done by assigning a failure rate to each electrical, electromechanical, and mechanical part, with the value of the failure rate dependent on the part's operating stress and duty cycle. (The failure rate of a part operating at 90% of its maximum rated value can be eight times higher than that of the same part operating at 30% of its rating.) The part failure rates were derived from sources such as MIL-HDBK-217F, the RAC handbook for nonelectronic parts, and the Bell Communications Research Reliability Manual.12-14 These rates were then combined at the assembly, subsystem, and system levels to predict the device's inherent reliability, which is expected to be approximately 20% higher than the specified requirement. In addition, the individual predicted failure rates were compared to the specification document to ensure that each element of the system met its reliability requirement. In some cases, additional design changes or part reliability improvements were incorporated to meet the overall design goals.
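The roll-up of part failure rates to a system prediction can be sketched as follows. This assumes a series system with constant failure rates and simple duty-cycle scaling, which is a simplification; the part names and failure rates are hypothetical and are not taken from MIL-HDBK-217F or the other cited handbooks.

```python
import math

# (part, quantity, failures per million hours at its operating stress, duty cycle)
# All entries are hypothetical illustration values.
parts = [
    ("pump motor", 2, 10.0, 0.5),
    ("power supply", 1, 5.0, 1.0),
    ("controller board", 1, 8.0, 1.0),
]

# Series model: any part failure is a system failure, so the rates add.
system_rate_fpmh = sum(qty * rate * duty for _, qty, rate, duty in parts)

# Predicted MTBF in hours (valid only under the constant-failure-rate assumption).
predicted_mtbf_h = 1e6 / system_rate_fpmh

# Probability of completing one 5.5-hour use (2 h setup + 1.5 h surgery + 2 h shutdown).
mission_reliability = math.exp(-system_rate_fpmh * 1e-6 * 5.5)
```

Comparing `predicted_mtbf_h` or `mission_reliability` with the specified requirement, at the system level and for each subelement, indicates where design changes or better parts would be needed.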
Reliability testing was integrated with other development-phase tests to create an overall integrated test plan for the delivery system project. Such early, well-planned testing can provide assurance that system reliability will be achieved in production.15 Testing of the system prototype units included accelerated-life testing to verify the system's ability to perform satisfactorily throughout its expected 10-year operating life and reliability growth testing (RGT) to identify potential failures, determine their causes, and then take corrective action to prevent failure recurrence. (Accelerated-life testing is the evaluation of life expectancy by subjecting an item to combined stresses well in excess of those expected in normal usage, while RGT involves evaluation of the device's performance in extreme specified-usage environments.) Additional RGT was conducted on preproduction systems to verify the effectiveness of the corrective actions, to ensure that inherent design reliability was not degraded by manufacturing processes, and to identify possible additional reliability enhancements. This integrated "test, analyze, and fix" approach is effective because after malfunctions are documented, analyses are performed to identify the root cause of each failure, corrective action is taken to prevent failure recurrence, and the effectiveness of the corrective action is verified with additional testing.
The reliability of the system's software was verified by combining analysis and auditing with thorough formalized testing. As with the hardware, all delivery system software defects were documented, their causes determined, and changes incorporated to prevent their recurrence. Sufficient regression testing--the repetition of previously completed tests--was performed following each modification to verify the effectiveness of the change and to ensure that no other errors had been introduced.
Once the device enters the marketplace, user feedback will be evaluated and changes incorporated to ensure that all customer expectations are satisfied. This approach will continue the reliability verification process in the actual-use environment, so that system reliability will continue to increase.
Applying reliability engineering techniques during the product development process can help ensure the reliability of medical devices. A reliability effort such as that described above, coupled with a comprehensive quality assurance program during production, can result in a device that will perform as expected and meet the stringent reliability requirements of critical health-care applications such as open-heart surgeries.
Lee Heydrick is principal of the Heydrick Consulting Group (Denton, TX), which provides reliability and quality engineering services. Kenneth A. Jones is vice president of research and development for Quest Medical, Inc. (Allen, TX).
1. Bell DD, "Contrasting the Medical-Device and Aerospace-Industries Approach to Reliability," in Proceedings, Annual Reliability and Maintainability Symposium, Washington, DC, Institute of Electrical and Electronics Engineers, pp 125-127, 1995.
2. "Best Practices: How to Avoid Surprises in the World's Most Complicated Technical Process: The Transition from Development to Production," NAVSO P-6071, Washington, DC, U.S. Department of the Navy, March 1986.
3. Dhillon BS, "Reliability Technology in Health Care Systems," in Proceedings of the International Association of Science and Technology for Development (IASTED) International Symposium, Computers and Advanced Technology in Medicine, Healthcare and Bioengineering, Anaheim, CA, ACTA Press, pp 84-87, 1990.
4. Greenberg HP, "Achieving Product Reliability," 44th Annual Quality Congress Transactions, Milwaukee, American Society for Quality Control, pp 398-403, 1990.
5. Heydrick L, "Effective Reliability Engineering during Product Development," in Fifth Annual Leesburg Workshop on Reliability and Maintainability Computer-Aided Engineering in Concurrent Engineering, New York, Institute of Electrical and Electronics Engineers, pp 183-188, 1991.
6. Joyce E, "Software Bugs: A Matter of Life and Liability," Datamation, May 15, pp 88-92, 1987.
7. Leveson NG and Turner CS, "An Investigation of the Therac-25 Accidents," Computer, July, pp 18-41, 1993.
8. RADC Reliability Engineer's Toolkit, Griffiss Air Force Base, NY, Rome Air Development Center, July 1988.
9. Reliability, Maintainability, and Supportability Guidebook, 2nd ed, Warrendale, PA, Society of Automotive Engineers, 1992.
10. Elahi BJ, "Safety and Hazard Analysis for Software-Controlled Medical Devices," in Proceedings of Sixth Annual IEEE Symposium on Computer-Based Medical Systems, New York, Institute of Electrical and Electronics Engineers, pp 10-15, 1993.
11. Leone AM, "Practical Techniques for Ensuring Software Reliability," presented at George Washington University, Washington, DC, August 27-29, 1991.
12. Military Handbook, Reliability Prediction of Electronic Equipment, MIL-HDBK-217F, Washington, DC, U.S. Department of Defense, December 1991.
13. Nonelectronic Parts Reliability Data, NPRD-91, Densen W, Chandler G, Crowell W, et al. (eds), Rome, NY, Reliability Analysis Center, 1995.
14. Reliability Manual, SR-TSY-000385 (Issue 1), Red Bank, NJ, Bell Communications Research, June 1986.
15. McLean H, "Exceeding the Limits of Traditional Reliability Tests," Med Dev Diag Indust, 16(4):96-100, 1994.