Reducing Risk without Wasting Resources: Doing Risk Management Right

Mike W. Schmidt

May 1, 2009

ISO 14971, “Medical Devices—Application of Risk Management to Medical Devices,” was first published in 2000.1 The risk management standard radically changed the process of understanding and controlling the risks associated with medical devices. In doing so, it presented a significant challenge to manufacturers. Too often, however, risk management is treated as yet another box to be checked. When that happens, an invaluable thread that can tie the entire life cycle together, dramatically improve productivity, and, most importantly, ensure the safety of the devices we produce becomes a drain on resources, dreaded by members of the design team and a potential source of failed audits. If the announcement of a risk analysis meeting elicits groans and excuses for not attending, the company's risk management process is probably inefficient and very possibly does not comply with the standard.

It would be a mistake to assume that risk management procedures comply with the standard just because a notified body or other auditor has not issued any findings. Like manufacturers, auditors have been, and still are, learning what ISO 14971 requires. Over time, however, those auditors have been exposed to the best and worst of what risk management can be. Because of this exposure, auditors' expectations of an ISO 14971–compliant system are increasingly rigorous. To stay ahead of those growing expectations, it is critical to understand the intent of each element of risk management and how those elements provide value to the rest of the development, production, and postproduction processes.

As a first step toward achieving that level of understanding, this article looks at the process of identifying hazards, estimating the associated risks, and controlling those risks to acceptable levels.

Identification of Hazards

One of the most common sources of confusion is misunderstanding exactly what constitutes a hazard. Hazards (as defined in ISO 14971 and ISO/IEC Guide 51) are potential sources of harm.2 Too frequently, the resulting harms are included in the list of hazards. Blending the cause with the result can lead to significant confusion as the process continues.

Hazards include types of energy such as potentially harmful voltages, excessive heat, or masses. They can also include circumstances (called hazardous situations in ISO 14971) such as chaotic environments or operation by untrained or poorly trained persons. None of those hazards, in and of themselves, necessarily results in harm. But each, under the right circumstances (trigger events, for the purposes of this article), can become a source of injury.

Unfortunately, too many manufacturers ignore what should be the primary source for identifying hazards: product safety standards. Too often, product development processes simply treat standards identification and compliance as another task that needs to be done without thought of the purpose or value.

Device safety standards are critical throughout the risk management process. They identify commonly recognized hazards associated with medical devices. They may identify hazards related to general categories of medical devices. For example, IEC 60601-1 identifies hazards associated with most electrically operated medical devices.3 In addition, standards may identify hazards associated with specific technological aspects. The ISO 10993 series identifies and guides in the evaluation of biocompatibility hazards.4 Others identify hazards associated with specific types of devices. IEC/ISO 80601-2-30, for example, identifies hazards associated with automated noninvasive sphygmomanometers.5 Failing to use such standards as the primary input to the risk management process increases the probability that hazards might not be identified. It can also cause manufacturers to ignore the primary set of risk-reduction techniques (also contained in those standards).

Hazards alone do not have severities (injuries or harms do), and hazards do not have varying levels of likelihood. A hazard either exists for a device or it does not. A hazard results in harm when a trigger event occurs.
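
One way to keep cause and result separate is to record them as different kinds of objects. The sketch below is illustrative only: the class and field names are assumptions rather than terms from ISO 14971, the 1-to-5 scales are placeholders, and the example values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Harm:
    description: str
    severity: int  # assumed scale: 1 (negligible) to 5 (catastrophic)


@dataclass
class TriggerEvent:
    description: str
    likelihood: int  # assumed scale: 1 (highly unlikely) to 5 (near certain)
    harms: list[Harm] = field(default_factory=list)


@dataclass
class Hazard:
    # A hazard carries no severity or likelihood of its own:
    # it either exists for the device or it does not.
    description: str
    trigger_events: list[TriggerEvent] = field(default_factory=list)


# Hypothetical entry: the hazard exists; the trigger event carries the
# likelihood, and the harm carries the severity.
voltage = Hazard("potentially harmful voltage")
voltage.trigger_events.append(
    TriggerEvent(
        "failure of insulation",
        likelihood=5,
        harms=[Harm("electric shock", severity=4)],
    )
)
```

Because the hazard object has no severity or likelihood field, a harm cannot be entered in the hazard list by accident.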

Severity Estimation

As a rule, once the potential sources of harm (i.e., hazards) have been identified, determining what harms they can cause is noncontroversial. What is important to remember is that different trigger events may result in different harms (even for the same hazard), and trigger events frequently can cause different levels of harm, each with its own likelihood. For each specific trigger event, it is usually possible to identify at least one harm and severity. However, different trigger events for a given hazard may well result in more than one harm or severity.

Although identifying harms and their severity does not, as a rule, lead to disagreements in a risk analysis meeting, it can. Most often, such discussions center on the possible severities that can result. If a single trigger event could cause various levels of harm, the most severe should be selected because it would carry the greatest risk (since the trigger event determines likelihood, which would be constant).
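
The worst-case rule above is simple enough to state in a few lines. This is a minimal sketch, assuming severities are ranked on a numeric scale in which a higher number means a more severe harm; the function name and the example harms are hypothetical.

```python
def worst_case_severity(severities):
    """Return the highest severity among the harms one trigger event can cause."""
    if not severities:
        raise ValueError("a trigger event needs at least one identified harm")
    return max(severities)


# Hypothetical example: one insulation failure could cause a tingle (1),
# a burn (3), or cardiac arrest (5). The likelihood of the trigger event
# is the same in each case, so the most severe harm carries the greatest risk.
severity = worst_case_severity([1, 3, 5])
```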

In the end, the decision to add another line item in the risk analysis should be based on whether each harm would have an independent mitigation or risk-reduction technique. If a harm does have a different mitigation, it should be addressed separately.

Finally, although reaching consensus on the harms that could occur and the associated level of severity is not difficult, it is wise to involve a clinician for validating those determinations. Clinicians are also invaluable in understanding the use environment and how medical personnel are likely to interact with the equipment, both of which are critical to usability engineering.

Likelihood Estimation

Those familiar with the most recent edition of ISO 14971 will note that the term likelihood has been replaced by probability in the standard. However, using the term probability in the context of risk estimation can be dangerous because it implies a level of detail and accuracy that simply cannot exist. Even if it were possible to accurately calculate the probability of a given harm, that accuracy is lost as soon as a value is assigned (such as a number between 1 and 5) or the appropriate column or row of a risk threshold table is identified.

In short, quantifying whether a harm could occur is always an estimate, and the term likelihood better reflects the fact that it represents an estimation. Spending significant resources to achieve high levels of accuracy with high confidence is almost always a waste of those resources. Therefore, for purposes of this article, the term likelihood is used as a reminder that we are making educated estimations.

For every identified potential source of harm (hazard), there are typically multiple trigger events that could bring the hazard into contact with a person or cause a hazardous situation. A potentially harmful voltage could be touched, or a chaotic environment could cause a person to make an error. In either case, the result could be an injury (harm). For example, trigger events that could turn a potentially hazardous voltage into a source of harm include the following:

• An exposure of these voltages such that they can be touched.

• A failure of insulation.

• A spill of conductive fluids.

• A high leakage current (unintended flow of electric current).

Therefore, for the hazard of “potentially harmful voltage,” the risk analysis would branch out to identify each trigger event that could lead the voltage to cause harm. Although trigger events can be identified using any number of techniques, a variation on failure mode analysis can be one of the most useful.

While failure modes and effects analysis (FMEA) should never be used as the primary risk analysis tool, by definition, parts of that process are ideal for identifying failure-related trigger events. Traditionally, FMEA focuses on the failure of component parts of the design and identifies the resulting effect of the failure. Failure mode effects and criticality analysis (FMECA) adds the ranking of the effect in terms of severity or importance. However, in using failure mode analysis as an input to risk analysis, the resulting effect of the failure and its criticality are identified in the risk analysis. So the analysis used as an input to that process need only identify the trigger events (although they can be identified here and transferred into the risk analysis). Failure mode analysis can be applied far more broadly than simply looking at component failures.

However, even though the process of identifying trigger events can be modeled on a traditional FMEA, caution is strongly advised. By its very name and traditional application, an FMEA implies that only failures are identified. ISO 14971 clearly states that harms that result from normal operation of the equipment as well as those caused by failures must be identified and the associated risks mitigated to acceptable levels. Do not allow the similarities between a trigger event analysis and a traditional FMEA to lead the risk analysis team to focus on failures alone.

Trigger event identification (using a failure mode–type analysis) can be applied to more than the device's hardware. It can also apply to software, user interfaces (foreseeable misuse), the environment in which it will be used, manufacturing errors, poor maintenance, misleading marketing information, or even statements by salespersons. In short, all trigger events that could result in injury need to be identified, and trigger event analysis can be applied to each hazard category.

Making hardware, software, usability, manufacturing, maintenance, marketing, and sales each the subject of a trigger event analysis is a way to organize the collection of trigger events and their estimated likelihoods. The results can then all be brought together in the top-level risk analysis document.
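
As a rough illustration of bringing separate trigger event analyses together, the sketch below merges per-discipline lists into one structure keyed by hazard. The dictionary layout, the function name, and every entry are assumptions made for the example, not a format prescribed by ISO 14971.

```python
from collections import defaultdict

# Each discipline's analysis yields (hazard, trigger event, likelihood) rows.
# All rows here are hypothetical.
analyses = {
    "hardware": [("potentially harmful voltage", "failure of insulation", 5)],
    "usability": [("potentially harmful voltage", "spill of conductive fluids", 5)],
    "maintenance": [("excessive heat", "ventilation filter not replaced", 5)],
}


def merge_analyses(analyses):
    """Group trigger events from every discipline under their hazard."""
    top_level = defaultdict(list)
    for source, rows in analyses.items():
        for hazard, trigger, likelihood in rows:
            top_level[hazard].append(
                {"trigger": trigger, "likelihood": likelihood, "source": source}
            )
    return dict(top_level)


risk_analysis = merge_analyses(analyses)
```

Keeping the source discipline on each row makes it easy to route follow-up questions back to the team that identified the trigger event.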

Table I. A sample table identifying hazards and trigger events with their level of likelihood and level of severity.

It should be noted that a preferred method for combining the severity of an injury and the likelihood that it will occur is a tabular or graphic method. Using this method, a table is created with one axis labeled with descriptions of different levels of likelihood and the other with descriptions of the various levels of severity (see Table I). Avoiding numeric values helps avoid the traps associated with the use of numbers and calculations (risk priority number or RPN method). However, for the purpose of this article and in order to quickly and clearly demonstrate the concepts being discussed, the RPN method is used.
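
For readers who prefer the tabular method, it amounts to a lookup rather than a calculation. The sketch below assumes three descriptive levels on each axis and invents the contents of every cell; a manufacturer's own risk policy would define the real labels and cell values.

```python
# Descriptive labels instead of numbers (all labels are assumptions).
LIKELIHOODS = ["remote", "occasional", "frequent"]
SEVERITIES = ["minor", "serious", "critical"]

# Rows are likelihood levels, columns are severity levels.
# "A" = acceptable, "U" = unacceptable. Cell values are hypothetical.
MATRIX = [
    # minor serious critical
    ["A", "A", "U"],  # remote
    ["A", "U", "U"],  # occasional
    ["U", "U", "U"],  # frequent
]


def evaluate(likelihood, severity):
    """Look up the acceptability of a likelihood/severity pair."""
    row = LIKELIHOODS.index(likelihood)  # raises ValueError for unknown labels
    col = SEVERITIES.index(severity)
    return "acceptable" if MATRIX[row][col] == "A" else "unacceptable"
```

For example, `evaluate("remote", "serious")` returns `"acceptable"` with this table, while `evaluate("occasional", "serious")` returns `"unacceptable"`. No arithmetic is performed, which removes the temptation to argue over decimal places.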

Using the numeric approach, each identified trigger event should be associated with the likelihood that it will occur. Likelihood is represented using a ranking of (for example) 1 through 5, with 1 being highly unlikely to occur and 5 being a relative certainty. Caution is advised when assigning likelihood rankings. Many individuals (especially those with strong mathematics backgrounds) spend extended periods of time attempting to accurately assign probabilities or to differentiate which ranking a given trigger event should be assigned. Generally such discussions are not productive considering that each ranking represents an extremely broad range of actual probabilities and, in the end, the ranking given is at best an estimate.

It is important to keep in mind that ISO 14971 requires that the level for each risk be estimated twice. The standard requires estimating the risk both before risk-reduction techniques (frequently called mitigations) have been implemented and then again after risk-reduction techniques have been put in place. It is common during the initial (premitigation) estimation of risk that extended disagreements over likelihood rankings occur. As mentioned earlier, if these discussions are focused on the accuracy of the likelihood, resources are almost certainly being wasted.

In addition, design engineers often argue that the likelihood that a harm will occur is extremely small “because it will comply with the appropriate standards.” However, compliance with standards is a risk-reduction technique and is therefore not applicable for premitigation risk. Typically, engineers are trained to associate problems with solutions. But protracted discussions about likelihood are almost always a waste of time during initial risk estimation.

Designers are not the only members of the risk analysis team that can be misled in the likelihood estimation process by their background and training. Commonly, quality engineers and regulatory personnel attempt to use resources such as the FDA medical device reports (MDR) database or the European Union vigilance database to estimate premitigation likelihood. These databases can certainly provide useful information in identifying trigger events and harms, but they have no value in estimating the likelihood that an injury will occur without a risk-reduction technique being employed. Presumably, at least, all devices on the market have had risk mitigations implemented or they would not have been allowed on the market by regulators. As a rule, the information available from such resources is a better indication of the acceptable level of risk for devices that have had all risks reduced. In other words, the devices in these databases represent broadly acceptable levels of risk assuming that the injuries that have occurred did not result in recalls, other regulatory action, or extensive lawsuits.

Table II. Identifying trigger events and their likelihood enables manufacturers to implement risk-reduction techniques. You will fill in the blanks as you conduct your analysis.

What makes more sense in terms of efficiency is to remember that without any risk-mitigation technique employed (initial risk level), the likelihood of injury is extremely high or at least inestimable. ISO 14971 says that when we cannot estimate likelihood, we should default to the highest level of likelihood. Will this result in an unacceptable risk? Of course. But that's OK because we haven't tried to reduce it to acceptable levels yet. What we have accomplished, though, is reducing or eliminating long, drawn-out, and completely unproductive arguments (see Table II). For most risks, simply identify the appropriate requirement from a consensus safety standard as the mitigation. That will make the risks acceptable (see Annex D item D.5.5 of the standard).
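
The default-to-highest rule can be captured in a small helper. This is a sketch under the article's example scale of 1 through 5; the function name and the use of `None` to mean "the team could not estimate" are assumptions.

```python
MAX_LIKELIHOOD = 5  # top of the assumed 1-to-5 ranking


def premitigation_likelihood(estimate=None):
    """Use the team's estimate if it has one; otherwise assume the worst."""
    if estimate is None:
        # No agreed estimate: default to the highest level and move on.
        return MAX_LIKELIHOOD
    if not 1 <= estimate <= MAX_LIKELIHOOD:
        raise ValueError("likelihood must be between 1 and 5")
    return estimate
```

Calling `premitigation_likelihood()` with no argument returns 5, which ends the debate and moves the meeting on to mitigations.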

After the trigger events associated with each hazard have been identified and the likelihood of an event is estimated, the harms that would occur and their severity must be identified. As a rule, this phase of the process goes fairly smoothly with minimal disagreement. Generally the team can quickly agree on how badly an individual would be harmed based on the hazard.

Risk Quantification

Once the trigger event has been identified and its likelihood and the severity of the resulting harm estimated, the risk can be quantified. This can be done through any number of techniques, but this example will simply multiply the likelihood and severity.

However, it is important to remember that the resulting risk ranking has little or no meaning outside the analysis in which it is used. Although determinations from previous risk analyses for similar devices can be used as an input or guide, they cannot be directly transferred from device to device or from analysis to analysis. The only value a risk ranking provides is as a tracking method to determine the relative level of risk before and after mitigation techniques are identified and applied, and whether the value has been reduced to an acceptable level.
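
A short worked example of the multiplication used in this article follows. The line item, its rankings, and the mitigation named in the comment are hypothetical; only the before-and-after comparison within this one analysis carries meaning.

```python
def risk_ranking(likelihood, severity):
    """Multiply the two rankings; the result is meaningful only within one analysis."""
    return likelihood * severity


# Hypothetical line item: insulation failure leading to electric shock.
pre = risk_ranking(likelihood=5, severity=4)   # before any risk reduction
post = risk_ranking(likelihood=1, severity=4)  # after, e.g., added insulation
reduction = pre - post
```

Here the ranking falls from 20 to 4. The severity is unchanged; the mitigation acted on the likelihood of the trigger event.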

Risk Evaluation

Table III. Risk can be quantified after the trigger and its likelihood and severity have been identified.

The steps described so far complete the initial or premitigation risk estimation phase (see Table III). The risk value derived for each harm must now be considered to determine whether it exceeds the acceptability threshold for the device. For those cases in which the risk level exceeds the threshold, risk-reduction techniques must be implemented to reduce those risks to acceptable levels.

In determining a policy for the acceptability of risk, many manufacturers have implemented a concept from an informative annex of the standard that was intended only to demonstrate how society perceives risk: the three-region risk chart, which includes acceptable risks, unacceptable ones, and risks that are as low as reasonably practicable (ALARP). This chart was provided in the original edition of ISO 14971 to show that there are some risks that everyone would consider unacceptable, some that would be thought of as acceptable by the general public, and, of course, some on which reasonable individuals might disagree. It was never meant to be a model for determining the acceptability of risks. ALARP is intended to be used in risk management only as a policy requiring that some risks that qualify as acceptable (but that are close to the acceptability threshold) should be further reduced if possible.

The standard requires that risks be identified only as acceptable or unacceptable. Introduction of a third undefined level of acceptability adds confusion but no value. In most cases in which the concept of ALARP has been misapplied, it is used to label this middle region. Processes using this method typically say that risks in the ALARP zone require a risk-benefit analysis. They also say that risks below this zone are broadly acceptable and require no risk reduction, and that risks above ALARP are unacceptable. The problem with this approach is that it eliminates risk-benefit analysis as a tool for many risks such as those associated with high-risk but high-reward procedures (e.g., open-heart surgery). However, if a simple two-region risk acceptability model is used, risks are either acceptable and need not be reduced further, or they are unacceptable and action must be taken to either reduce the likelihood of the trigger event (most common) or the severity of the harm. Risk-benefit analysis then becomes what it was intended to be by ISO 14971: a way to show that otherwise unacceptable risks are acceptable only because significant benefit is provided that could not exist without that risk.

It should also be noted that when risks are near but still below the acceptable risk threshold, ISO 14971 says that we should evaluate whether additional risk mitigations can be employed.

This concept is intuitively obvious. As discussed, all risk estimations are just that—estimations that inherently carry the potential for error. When those errors might result in a risk that is acceptable but near the acceptability threshold, additional reduction in that risk is advised to ensure an adequate margin. A historic evaluation (as required by subclause 3.2 of the risk management standard) can provide insight on the accuracy of your risk management process and help in determining when additional mitigations are appropriate for otherwise acceptable risks.
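
The two-region model, together with the margin check for risks near the threshold, might be sketched as follows. The threshold of 8 and the margin of 2 are invented for illustration on a 1-to-25 product scale; each manufacturer's risk policy sets its own values.

```python
ACCEPTABLE_MAX = 8  # assumed policy threshold on a 1-to-25 product scale
MARGIN = 2          # assumed band just under the threshold


def evaluate_risk(ranking):
    """Classify a risk ranking as acceptable or unacceptable (two regions only)."""
    if ranking > ACCEPTABLE_MAX:
        return "unacceptable: reduce risk or justify with risk-benefit analysis"
    if ranking > ACCEPTABLE_MAX - MARGIN:
        # Still acceptable, but close enough to the threshold that
        # estimation error matters.
        return "acceptable: consider further reduction"
    return "acceptable"
```

Note that the near-threshold result is still "acceptable". It is a prompt to look for more margin, not a third region of acceptability.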

Risk Reduction

For those risks that are unacceptable based on the manufacturer's policy or method for evaluating risk, risk-reduction techniques must be implemented to reduce the level of risk to acceptable levels. Those techniques include design features to reduce the likelihood of the trigger event or the severity of the harm that would result. Where design solutions are not practicable, manufacturers might implement guards that prevent access to the hazard. When neither design solutions nor guards are practicable, warnings may be provided through labeling or instructions. Remember that warnings are an acceptable solution for reducing risk only when design solutions or guards (which could be considered a design solution) are not reasonably practicable. Resorting to warnings without documenting why design or guard solutions are not reasonable may lead to objections by regulators and auditors. Moreover, in court, the manufacturer may be characterized as having resorted to a perceived cheap solution rather than taking appropriate action to protect patients, clinicians, and bystanders.
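
The order of preference described above (design first, then guards, then warnings) can be enforced mechanically. In this sketch the control names, the function, and the rule that a warning requires a written justification are assumptions modeled on the paragraph above rather than wording from the standard.

```python
PRIORITY = ["design", "guard", "warning"]  # most to least preferred


def choose_control(practicable, justification=""):
    """Pick the highest-priority practicable control.

    `practicable` is the set of control types the team judged practicable.
    A warning is allowed only with a written reason for rejecting the others.
    """
    for control in PRIORITY:
        if control in practicable:
            if control == "warning" and not justification.strip():
                raise ValueError(
                    "document why design and guard solutions are not practicable"
                )
            return control
    raise ValueError("no practicable risk-reduction technique identified")
```

With this rule, `choose_control({"guard", "warning"})` returns `"guard"`, and a warning on its own is rejected until the rationale for passing over design and guard solutions has been recorded.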

However, if the manufacturer has effectively used device safety standards, it won't need to start a desperate search for practical risk-reduction techniques. The standards used to identify hazards associated with the device provide risk-reduction techniques for each of those hazards. Furthermore, ISO 14971 says that when manufacturers comply with the requirements of those safety standards, the hazards (and risks) associated with each requirement are presumed to be broadly acceptable and no further mitigation is required. This means that for each hazard and associated risk derived from standards, compliance with the requirement is identified as risk mitigation and sets the postmitigation risk level well within the company's acceptable risk range.

Note that this approach has not estimated the postmitigation likelihood in these cases; it is not necessary. The severity of the harm (determined in the premitigation analyses) and the resulting risk level (broadly acceptable) are known. Therefore, the likelihood that would drive the level of risk can be calculated. When standards are used properly, hazards, harms, and mitigations are identified. This eliminates the need to make likelihood estimates either pre- or postmitigation. For most medical devices (depending on the number of standards available for the device), well over 90% of the hazards and the levels of risk can be identified and reduced to acceptable levels thoroughly, and with minimum time expenditure.
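
To show how the likelihood could be back-calculated when the risk ranking is a simple product of likelihood and severity, as in this article's example, consider the sketch below. The threshold of 8 and the severity of 4 are hypothetical values, and the function name is an assumption.

```python
def implied_max_likelihood(severity, acceptable_max):
    """Largest likelihood ranking whose product with severity stays acceptable."""
    return acceptable_max // severity


# Hypothetical: severity 4 was set in the premitigation analysis, and the
# company's policy treats rankings up to 8 as acceptable.
implied = implied_max_likelihood(severity=4, acceptable_max=8)
```

With these numbers the implied likelihood ranking is at most 2. A result of 0 would mean that no likelihood ranking on the scale makes the risk acceptable by this arithmetic alone.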

The techniques outlined here won't eliminate the need to think “outside of the standard” and identify hazards and risks that are unique to a device being evaluated. New features and creative solutions to common problems always have the potential to give rise to unique risks that must also be made acceptable.

Conclusion

ISO 14971 requires that manufacturers identify any new hazards and risks that might have been created by risk-reduction techniques. For virtually all of the risk reductions derived from standards, there will be no additional risks created (or they would be addressed by other requirements in those standards). However, when manufacturers develop their own unique risk mitigations, this type of analysis is critical.

In the end, thoroughly understanding the risk management process and its purpose, as well as implementing standards as an integral part of that process, can significantly improve both the effectiveness and efficiency of a device company's risk management efforts.

Mike W. Schmidt is principal consultant and owner of Strategic Device Compliance Services (Cincinnati).

 

References

1. ISO 14971, “Medical Devices—Application of Risk Management to Medical Devices” (Geneva: International Organization for Standardization [ISO], 2007).

2. ISO/IEC Guide 51, “Safety Aspects—Guidelines for Their Inclusion in Standards” (Geneva: ISO, 1999).

3. IEC 60601-1, “Medical Electrical Equipment—Part 1: General Requirements for Safety and Essential Performance” (Geneva: International Electrotechnical Commission, 1995).

4. ISO 10993, “Biological Evaluation of Medical Devices” (Geneva: ISO, 2007).

5. IEC 80601-2-30:2009, “Medical Electrical Equipment—Part 2-30: Particular Requirements for Basic Safety and Essential Performance of Automated Non-Invasive Sphygmomanometers” (Geneva: ISO, 2009).

Copyright ©2009 Medical Device & Diagnostic Industry
