Independent organizations and FDA establish standardized test methods for materials and devices. The independent organizations (e.g., AAMI, ASTM, ISO, and USP) create voluntary consensus standards. These may then be recognized by FDA, with some 1300 standards now found in the Recognized Consensus Standards database. For some of these standards, FDA includes a Rationale for Recognition statement with the curious language that the standard “is recognized on its scientific and technical merit and/or because it supports existing regulatory policies.”
Recognized standards may in turn be referenced in guidance documents. Once a standard is recognized, and certainly if it is mentioned in a guidance document, there is an expectation, if not quite a requirement, that a manufacturer will test its device to the cited standard as part of any pre-market submission. For example, biocompatibility testing is addressed by FDA’s guidance on the use of ISO 10993-1. Similarly, for shelf life and sterility, FDA cites ANSI/AAMI/ISO 11607-1. FDA has also announced, but not yet implemented, a testing-based alternative to establishing substantial equivalence by direct comparison. Once device types and applicable performance criteria have been identified under this program, a manufacturer will be able to reach the market by meeting the FDA-identified performance criteria instead of by comparison to a predicate device. These performance criteria may be based on existing or new standardized tests.
The underlying assumption for all such testing is that the standardized test correctly captures some or all of the criteria relevant to establishing the necessary performance of a medical device. When this is true, such standards force desirable testing. Designing to the test can also be beneficial, provided the test is a good one. The existence of a recognized test simplifies the manufacturer’s task of deciding what to test and how to test it, and it aids regulators in reviewing test results because they will be familiar with the test method and the expected results. The results of such testing, if shared, also make communication about device performance more meaningful and may help consumers select a device. However, this raises the question of what numerical results actually mean with respect to the relative merits of different devices.
There are also disadvantages to standardized testing. Perhaps the most important is that if the test is not appropriate for ensuring reasonable safety and efficacy, then following and passing the test will have limited meaning. In such a case, the fact that the device passed the test gives false assurance that the device is properly designed. When the test is not appropriate or adequate, designing to the test is harmful in that it may drive a design in the wrong direction, or at least not in the right direction. A simple example is human factors testing that is not done under proper simulated use conditions. In that case, the ability of test subjects to use a device properly and consistently would not reflect what might happen under the real conditions of clinical use, where clinicians could be busy and distracted. In this regard, the theoretical ability to use a device correctly is not the same as real people using it correctly in the real use environment. Another issue is that a test may not look at the right thing. For example, testing a “safety” sharp may confuse the ability to operate the safety mechanism in the comfort of a conference room with the ability to do so while dealing with a stressed patient. In addition, there may be confusion between the end state of the sharp being covered and the process of covering it.
Another potential problem with standardized testing is that it could stifle innovation. This can occur when the characteristics of a new device do not meet the explicit or tacit assumptions that were made in creating the test. Some tests limit their applicability to particular types of materials (e.g., ceramic or metallic). In the area of materials, a property determination may assume that the material shows a classic linear deformation curve in bending. If a new material does not follow this expectation, then testing it will be challenging, and explaining why it couldn’t be tested, or explaining odd results if it is tested, could be even more challenging. For example, one ASTM test for spinal components defines the “bending stiffness” as the slope of the initial linear elastic portion of the load versus total displacement curve. But what if the material is highly non-linear and does not have a discernible initial linear slope? Another standard, for joint wear, requires that any “new or different material” not have a wear rate worse than that of cobalt-chrome-molybdenum against ultra-high-molecular-weight polyethylene, but there is no basis for treating this pair’s wear rate as the limit of acceptability. Standardized tests can also be self-serving to those who helped develop them, especially when they reflect proprietary interests and are tailored to existing device characteristics. In this regard, I once worked on a standard for adult portable bed rails during which a manufacturer argued (unsuccessfully, I am glad to say) against a safety provision that their product couldn’t pass.
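The bending stiffness problem can be made concrete with a small sketch. This is not the procedure from any ASTM standard; it is a minimal illustration, assuming that "initial linear portion" is taken as the first 20% of the recorded points and that a coefficient of determination below 0.99 means no discernible linear region. The function name, the `fraction` and `min_r2` parameters, and the synthetic curves are all hypothetical.

```python
def initial_stiffness(displacement, load, fraction=0.2, min_r2=0.99):
    """Fit a line to the first `fraction` of a load-displacement curve.

    Returns (slope, r_squared, is_linear). The window size `fraction` and
    the threshold `min_r2` are illustrative assumptions, not values taken
    from any standard.
    """
    if len(displacement) != len(load):
        raise ValueError("displacement and load must have the same length")
    n = max(3, int(len(displacement) * fraction))
    x, y = displacement[:n], load[:n]
    n = len(x)
    if n < 3:
        raise ValueError("need at least three points to judge linearity")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0:
        raise ValueError("displacement does not vary in the fitting window")

    slope = sxy / sxx  # least-squares slope: the "stiffness" estimate
    # R^2 of the straight-line fit; a perfectly flat load is treated as linear.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, r_squared, r_squared >= min_r2


# Synthetic curves (made-up numbers, not test data).
displacement = [0.1 * i for i in range(21)]
linear_load = [50.0 * d for d in displacement]            # classic linear response
nonlinear_load = [50.0 * d ** 0.5 for d in displacement]  # no initial linear region

linear_result = initial_stiffness(displacement, linear_load)
nonlinear_result = initial_stiffness(displacement, nonlinear_load)
print(linear_result)
print(nonlinear_result)
```

Note that the fit still returns a slope for the non-linear curve. A number can always be computed; the fit quality is what indicates whether that number corresponds to the property the test definition has in mind.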
In short, sound, proven, and independent tests are beneficial, but weakly validated tests are harmful.