Note: This article is based on the white paper "Establishing Bioburden Alert and Action Levels" available for download.
Most national and international standards regarding bioburden, sterilization, or environmental testing recommend establishing alert and action levels to demonstrate continued control over a process or product.
ISO 11737-1:2006 provides guidance on establishing bioburden alert and action levels.1 Clauses A.8.5–8.7 give general guidelines for setting environmental or bioburden levels. This guideline does not dictate how to use the data to establish action and alert levels, nor does it provide guidance on how to interpret the data depending on the sterilization modality in use.
The standard for radiation sterilization, ISO 11137-2:2006, assumes that dose audits are being performed quarterly. Clause 10.1 states:
A review of environmental and manufacturing controls, together with determinations of bioburden should be conducted in conjunction with sterilization dose audits. If the review indicates lack of control, appropriate action should be taken.
No definition is provided for the phrase “indicates lack of control.” It is clear that some criteria should be established. Most companies comply with this requirement by establishing alert and action levels for bioburden and environmental counts.
There are many factors involved in establishing bioburden alert and action levels in a variety of situations.
The 11737-1 document discusses the fact that bioburden data seldom fit into a normal distribution (i.e., a bell-shaped curve). In evaluating bioburden data, consider whether it is important that the data fit a standard statistical model (e.g., normal distribution). Whether the data fit a standard statistical model is less critical than whether the established levels are based on empirical data and whether they provide safety from a sterilization perspective.
One primary reason that bioburden data do not fit a normal distribution is the occurrence of bioburden spikes. It is common to obtain most bioburden values near the mean but also to occasionally have a value that is well above the mean (i.e., a bioburden spike). Bioburden spikes are common in the medical device industry, especially with manual assembly.
The other main reason that bioburden data do not fit a normal distribution is the frequent occurrence of results of zero colony-forming units (CFUs) (e.g., <1 CFU per sample). The standard deviation in this case may be zero; thus, use of standard distributions is impossible, and a different approach is required.
Alert and Action Levels
For alerts and actions, some use the term limits rather than levels. The term limit implies that a product has been affected by an excursion above that value. Use of levels does not imply that the product has automatically been impacted and is generally preferred. A search into established documents and standards provides definitions regarding alert and action levels (or limits). 2–5
Alert Level. Indicates when a process might have drifted from normal operating conditions. An investigation may be performed and corrective action may be implemented, but no action is required. Repeated excursions above the alert level may be treated as if the action level had been exceeded.
Action Level. Indicates that a process has drifted from normal operating conditions. An investigation must be performed and corrective action must be implemented.
Manufacturers are responsible for setting their own internal specifications for bioburden and environmental alert and action levels. Alert and action levels should be used as a means to monitor manufacturing processes and not as stand-alone product acceptance criteria.
Neither alert nor action levels should be based solely on environmental or bioburden counts without considering the method of sterilization and the amount of overkill in the cycle. In setting the levels, there should be a balance between demonstrating adequate control over bioburden and avoiding frequent triggering of the alert and action levels.
Setting levels is not purely a mathematical exercise. It also involves looking at the proposed levels with common sense.
TNTCs, Spreaders, and Spikes
A bioburden or environmental agar plate may have growth covering the entire surface where distinct colonies cannot be enumerated. These are usually called too numerous to count (TNTC) or spreaders. TNTC describes individual colonies that are indistinguishable because of high numbers of colonies on the filter or plate. Spreaders describe one or more colonies that have covered a portion of or the entire filter or plate. Spreading can be caused by particulates, the nature of the microorganisms, or by fluid on the filter or plate.
TNTC results should not be assigned a CFU value. Using an assigned value beyond the countable range, such as 300, would likely result in an underestimation of the bioburden. In review of any bioburden data, a TNTC result likely indicates a bioburden problem and signals the need for an investigation. The investigation may call for additional testing.
Spreaders do not allow for an accurate count. The count should be discarded when gathering historical data to establish bioburden levels. Spreaders generally indicate a problem with the test method.
Occasionally, spikes are observed in bioburden testing. Currently there is no harmonized definition for a bioburden spike. One common definition is an individual value that is greater than or equal to twice the mean.
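The common definition above (an individual value at or above twice the mean) can be sketched in a few lines of Python. This is only an illustration: the function name `find_spikes`, the `factor` parameter, and the sample counts are hypothetical and are not taken from Table I.

```python
from statistics import mean

def find_spikes(counts, factor=2.0):
    """Return (sample_number, value) pairs for counts >= factor * mean.

    Sample numbers start at 1. If every count is zero, the mean is zero
    and no value is treated as a spike.
    """
    threshold = factor * mean(counts)
    if threshold == 0:
        return []
    return [(i, c) for i, c in enumerate(counts, start=1) if c >= threshold]

# Hypothetical CFU counts for ten samples (illustrative only).
counts = [12, 9, 15, 88, 11, 14, 10, 13, 8, 12]
print(find_spikes(counts))  # -> [(4, 88)]
```

A flagged value is only a prompt for investigation; whether it is a true value still has to be determined as described in the text.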
Spikes or outliers should be investigated. If they are not true values, then either that value or the entire data set should be discarded. If the investigation determines that they are true values, there may be a bioburden problem in the manufacturing or testing process. It is unwise to set alert and action levels while such a problem is present. If possible, the cause of the spike should be identified and corrected. If this is not done, infrequent spikes could eventually become more frequent.
As part of the investigation, determine whether the spike value raises a potential concern regarding the ability of the current sterilization cycle to provide product that is sterile to the desired sterility assurance level (SAL). This evaluation varies depending on the sterilization mode used and on the bioburden counts at the time of the validation (see Table I).
Table I. Notice the spike in sample 4 in this example of bioburden data.
Each product or product family should be evaluated independently, and its levels should be established based on historical trends.
When establishing levels for a new product, use initial or temporary levels until enough data are gathered to establish long-term levels.
Initially test the samples more frequently (e.g., weekly or monthly) to establish a baseline. With these baseline data, temporary alert and action levels can be established. Testing on a typical basis (e.g., quarterly) for the remainder of the year will result in sufficient data for determining long-term alert and action levels.
Three initial sets of data representing three batches can provide a good statistical basis for temporary levels. Use of the same mathematical approaches for establishing temporary versus long-term levels is appropriate with the understanding that the temporary levels may be triggered more frequently.
Create a plan for setting long-term alert and action levels. It should cover the transition of temporary to long-term levels and the frequency of reevaluation.
Once sufficient bioburden data have been gathered, long-term alert and action levels should be established. When gathering data, consider the following to ensure that sufficient data representative of the product have been gathered:
Samples should represent the entire lot. If a manufacturing batch is made specific for testing, extra care must be taken to ensure that the testing batch is representative of routine manufacturing. 2
Bioburden data should be gathered over an extended period of time. It is typical to gather data over one year. 2
At least four sets of data should be used. As more data are gathered, the margin of error decreases. For example, one set of 10 samples per quarter of the year (40 data points) generally provides sufficient trending to establish levels.
Employ a validated recovery efficiency for product bioburden levels. A recovery efficiency validation should be performed for each sample product type (e.g., minimum of three samples) and applied to all data points before data evaluation begins. If multiple recovery efficiencies are determined over time, take the mean of all recovery efficiencies and apply it to each set of data. Applying the same recovery efficiency to all data reduces variation when comparing bioburden estimates and is appropriate as long as the same extraction method is used for each set of data. As described in the bioburden standard, the correction factor is derived from the recovery efficiency.
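As a rough sketch of how a recovery efficiency might be applied, the Python below assumes the usual convention that the correction factor is 100 divided by the percent recovery efficiency. That formula is an assumption not spelled out in this article, and the function names and sample counts are hypothetical. The 58.7% figure is the recovery efficiency quoted in the Table IIc caption.

```python
from statistics import mean

def correction_factor(recovery_efficiencies_pct):
    """Correction factor from the mean of one or more recovery efficiencies (%).

    Assumes correction factor = 100 / mean recovery efficiency (in percent).
    """
    mean_re = mean(recovery_efficiencies_pct)
    if not 0 < mean_re <= 100:
        raise ValueError("recovery efficiency must be in (0, 100] percent")
    return 100.0 / mean_re

def apply_correction(counts, factor):
    """Apply the same correction factor to every raw CFU count."""
    return [c * factor for c in counts]

cf = correction_factor([58.7])      # single validated recovery efficiency
raw_counts = [10, 0, 25]            # hypothetical raw CFU counts
print(round(cf, 3), [round(c, 1) for c in apply_correction(raw_counts, cf)])
```

Passing several recovery efficiencies to `correction_factor` averages them first, which mirrors the advice to apply one mean recovery efficiency to every data set.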
Using standard deviations to set levels is a simple and easy approach. A misleading argument against using standard deviations is that microbiological data may not fit a normal distribution. However, the standard deviation is a useful measure of the dispersion of the data, even if data are not normally distributed.
As a larger sample size of bioburden data becomes available, a move toward a normal distribution may not always be seen. Although a larger sample size could result in a normal distribution of microbiological data, the presence of even a single very high value could result in the data not being normally distributed.
Additionally, a larger sample size of bioburden may not necessarily move toward a normal distribution if there is no growth (e.g., 0 CFU observed). In this situation, the sterilization method may be used to establish the alert and action levels. Another option is to use other distributions and their corresponding statistics to establish levels. Although low bioburden data are said to follow a Poisson distribution, in our evaluation of 47 data sets of product with high bioburden, the Poisson distribution was generally not found.
It is not desirable that the alert level be triggered often, as that would be an indication that there is either too much variability in the bioburden results or that the alert level is too low.
It is best to use the bioburden estimate to establish values rather than bioburden averages or maximum values. This would require that a recovery efficiency be validated for each product type to calculate the bioburden estimate. For environmental monitoring, the bioburden average would be used because a recovery efficiency is generally not performed.
Table IIa. This table shows bioburden data from monthly monitoring. Three initial sets of data representing three batches can provide a good statistical basis for temporary levels.
Table IIb. From a bioburden perspective, a comparison of the first three months (see Table IIa) versus the entire year shows the bioburden estimate and bioburden estimate plus standard deviations are similar. This demonstrates that, as the manufacturing process was refined over time, there was not a significant change and the bioburden is similar.
Initial Evaluation of the Data
Tests are usually performed monthly for the first quarter, then quarterly for the rest of the year. Using bioburden data from the product in question, the mean, standard deviation, and bioburden estimate for each set can be calculated as well as the overall mean, average standard deviation, and average bioburden estimate. The sum of aerobic bacterial and fungal data for each sample could be used in all calculations.
Additional calculations were performed to determine the bioburden estimate plus two and three standard deviations as well as the bioburden estimate times 10 (see Table II, parts a, b, and c).
From a bioburden perspective, a comparison of the first three months versus the entire year might show that the bioburden estimate and bioburden estimate plus standard deviations are similar. This would demonstrate that, as the manufacturing process was refined over time, there was not a significant change and the bioburden is similar.
Using standard deviations to establish the bioburden levels is similar to the “normal distribution approach” in PDA TR13.2 The alert level can be set at two standard deviations above the historical bioburden estimate, and the action level can be set at three standard deviations above the historical bioburden estimate. This approach results in tight alert and action levels, which would be appropriate for bioburden-based methods such as radiation.
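The standard deviation approach for bioburden-based methods can be sketched as follows. This is a minimal illustration under stated assumptions: the recovery efficiency is applied to each data point before evaluation, the correction factor is taken as 100 divided by the percent recovery efficiency, and the sample standard deviation is used. The function name and the data are hypothetical.

```python
from statistics import mean, stdev

def bioburden_levels(counts, recovery_efficiency_pct):
    """Alert and action levels from historical CFU counts.

    Assumptions (not specified in the article): the correction factor is
    100 / recovery efficiency (%), it is applied to every data point, and
    the sample (n - 1) standard deviation is used.
    """
    cf = 100.0 / recovery_efficiency_pct
    corrected = [c * cf for c in counts]
    estimate = mean(corrected)       # historical bioburden estimate
    sd = stdev(corrected)
    return {
        "estimate": estimate,
        "sd": sd,
        "alert": estimate + 2 * sd,   # alert: estimate + 2 standard deviations
        "action": estimate + 3 * sd,  # action: estimate + 3 standard deviations
    }

# Hypothetical historical counts pooled from several data sets.
history = [12, 9, 15, 11, 14, 10, 13, 8, 12, 16]
levels = bioburden_levels(history, recovery_efficiency_pct=58.7)
print({name: round(value, 1) for name, value in levels.items()})
```

Whatever numbers come out, they still need the common-sense review described earlier before being adopted.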
Table IIc. The established recovery efficiency of 58.7% was applied to all data for consistency. This is appropriate, because all testing was performed using the same extraction method.
For radiation sterilization using VDmax, there is an established bioburden count that should not be exceeded, which is the maximum bioburden count permitted in the sterilization table being used in ISO 11137-2 and ISO 13004.1,6 For example, for 25 kGy, the maximum allowable bioburden count is 1000 CFUs. This would be an example of when the term limit might be appropriate.
When establishing levels for overkill-based methods (e.g., EtO), alert and action levels could be based on the bioburden estimate + 3 × standard deviations and the bioburden estimate × 10, respectively. A good limit for such products using overkill methods could be when the bioburden approaches or exceeds the titer of the biological indicator. The amount of safety provided in overkill cycles should allow for greater flexibility in the alert and action levels.
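For overkill-based methods, the looser levels described above might be computed as in the sketch below. The function names are hypothetical, and the default biological indicator titer of 1,000,000 is an illustrative placeholder rather than a value from this article; substitute the titer of the indicator actually used.

```python
def overkill_levels(estimate, sd, bi_titer=1_000_000):
    """Levels for an overkill cycle from a bioburden estimate and its SD.

    alert  = estimate + 3 * sd
    action = estimate * 10
    limit  = biological indicator titer (hypothetical default value)
    """
    return {
        "alert": estimate + 3 * sd,
        "action": estimate * 10,
        "limit": bi_titer,
    }

def classify(value, levels):
    """Report the highest level that a bioburden result reaches."""
    if value >= levels["limit"]:
        return "limit"
    if value > levels["action"]:
        return "action"
    if value > levels["alert"]:
        return "alert"
    return "within levels"

levels = overkill_levels(estimate=50.0, sd=20.0)
print(levels["alert"], levels["action"])   # -> 110.0 500.0
print(classify(200, levels))               # -> alert
```

With highly variable data, the estimate plus three standard deviations can exceed ten times the estimate, so the two values are worth comparing before they are adopted.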
Evaluation of Data Normality
The Statistical Analysis System (SAS) procedure PROC UNIVARIATE was used to evaluate the normality of 47 different data sets (10 samples per data set). The following four statistical tests were used in these evaluations: Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises, and Anderson-Darling. These tests agreed reasonably well on the determination of the normality of each of these data sets. Many data sets were found to be abnormal due to a single outlier, a value more than two standard deviations above the mean (33 out of 47, or 70%).
In the evaluation of these data sets, it was determined whether each data set had a single outlier, which was defined as a single data point more than twice the standard deviation beyond the mean (i.e., mean + 2 standard deviations). Most data sets were either abnormal due to an outlier or normal due to no outlier (39 out of 47 or 83% were deemed abnormal). This result demonstrates that using the SAS program may be a simple and reasonably accurate way to determine whether a given data set has an influential outlier.
When the initial sets of 10 were grouped into sample sizes of 20 or 30 data points (based on product type) and evaluated as described earlier, the data did not become normal solely based on a larger sample size. In fact, the rule of thumb described worked with each of the data sets in determining normality (100%).
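The outlier rule of thumb used in this evaluation (a single data point above the mean plus two standard deviations) is easy to check without statistical software. The sketch below shows only that rule and does not reproduce the SAS normality tests. The function names and data are hypothetical, and the sample standard deviation is an assumption.

```python
from statistics import mean, stdev

def outliers_above_2sd(data):
    """Return the values greater than mean + 2 * sample standard deviation."""
    cutoff = mean(data) + 2 * stdev(data)
    return [x for x in data if x > cutoff]

def has_single_outlier(data):
    """True when exactly one value lies above mean + 2 standard deviations."""
    return len(outliers_above_2sd(data)) == 1

# Two hypothetical data sets of 10 samples each.
with_spike = [12, 9, 15, 88, 11, 14, 10, 13, 8, 12]
no_spike = [10, 12, 11, 13, 9, 10, 12, 11, 10, 12]
print(has_single_outlier(with_spike), has_single_outlier(no_spike))  # -> True False
```

Under the rule of thumb, the first set would be expected to fail a normality test and the second would not, although only a formal test confirms that for any given data set.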
There are many factors involved in establishing alert and action levels for product and environmental bioburden. A thorough review of bioburden data can assist in selecting the best approach for the situation. The approaches discussed here have functioned well for a variety of product and sterilization types.
There is often discussion in the industry regarding the appropriateness of standard distributions for evaluating bioburden data. Fitting the bioburden data into a specific statistical distribution is less critical than understanding the ranges of bioburden over time.
An important part of this process is having a good definition for alert and action levels and understanding what should occur when each is triggered. Different sterilization types should require different numerical levels as well as specified follow-up actions.
- ISO 11737-1:2006, “Sterilization of Medical Devices—Microbiological Methods—Part 1: Determination of a Population of Microorganisms on Products” (Geneva: International Organization for Standardization, 2006).
- PDA Technical Report 13, “Fundamentals of an Environmental Monitoring Program” (Bethesda, MD: Parenteral Drug Association, 2001).
- USP <1116>, “Microbiological Control and Monitoring of Aseptic Processing Environments” (Rockville, MD: United States Pharmacopoeial Convention, 2012).
- USP <1231>, “Water for Pharmaceutical Purposes” (Rockville, MD: United States Pharmacopoeial Convention, 2012).
- Code of Federal Regulations 21 CFR 211.
- ISO 13004, “Sterilization of Healthcare Products—Radiation—Substantiation of Selected Sterilization Dose: Method VDmaxSD” (Geneva: International Organization for Standardization, July 2013 [not yet published]).
Martell Winters is a senior scientist at Nelson Laboratories, where he has worked for 18 years. He has been involved in writing AAMI/ISO and AATB documents for 15 years. Winters is a registered microbiologist and a specialist microbiologist.
Esther Patch is study director II for Nelson Laboratories. She graduated with a degree in chemistry with an emphasis in biochemistry and a minor in biology from Erskine College and Seminary (Due West, SC). She is a national registered biologist.
Wendy Wangsgard is bioburden department scientist and has been with Nelson Laboratories for eight years. She is involved with the radiation sterilization, microbiological methods, sterility assurance level, and other working groups of AAMI.
Harry Bushar is an independent statistician currently employed part-time by FDA. He has served as a member of the AAMI Radiation Sterilization Subcommittee and the Gamma Radiation Sterilization Working Group.
Ashley Ferry is a quality assurance investigator at Nelson Laboratories. Ferry also audits testing to ensure compliance with cGMPs, ISO/AAMI, USP, and internal SOPs and STPs. She is a registered microbiologist.