Originally published August 1994
A recent article on the subject of hazard analysis stated the following regarding medical device software: "Software failures have resulted in patient deaths, and the potential for more such hazards is growing as software becomes more complex and increasingly important in medical devices."1 If this is true, then software testing--making sure that a device works and that its hazard potential is minimized--should be a very high priority for device manufacturers. But this has not always been the case.
In fact, everyone involved in the production of device software knows the following about software testing:
- Testing never starts until the very end of the product development cycle.
- Test groups don't even see the product until it's just about completed.
- Even if the product is delivered to the test group late, the product shipping date is seldom adjusted accordingly.
- Time-to-market pressures force testers to cut corners.
- The test group relies on manual testing plans, checkoff sheets, and handwritten log books.
- Regression testing--verifying the same information over and over again--leads to tester burnout.
- Test-group boredom leads to bungled tests.
- Almost nothing of value is salvaged from tests developed for one product and used on another. Everything has to be recreated.
Thankfully, because of three factors, these practices are quickly becoming history.
The first factor is an increased concern for safety. Of course, no company sets out to build a harmful product. Not only is it unethical, it's bad business. The results of poorly tested software can be devastating, both to the patient and to the company that releases a product.2
The second factor is competition. Companies concerned about product quality and improved time to market are revolutionizing the role of the test group. They want better products, delivered faster. Modern, automated software-testing systems can help.
The third factor, of more immediate importance to medical device manufacturers, is regulatory pressure from FDA. As FDA continues to tighten its software test requirements, it is sending the message that software testing is not an end-of-project exercise in pressing keys and jotting down results. Rather, to quote an agency document, "The FDA is focusing attention on the software development process to assure that potential hazardous failures have been addressed, effective performance has been defined, and means of verifying both safe and effective performance have been planned, carried out, and properly reviewed."3
In response to these pressures, progressive software development teams are making test professionals an integral part of the entire product development process. Instead of relying on temporary workers or reluctant (and biased) developers to press buttons and manually record test results, these companies are building test organizations equipped with automated test tools and staffed with computer professionals dedicated to software testing. This strategy has two big payoffs: the testers produce better tests and the developers learn to write code that is more testable.
Before looking at these new developments, however, let's review where software testing stands today in most companies.
TRADITIONAL APPROACHES TO MANUAL SOFTWARE TESTING
There is really no such thing as too much testing. The questions are, how much is enough and what kind of testing gives the most value? The answer to the first question helps manufacturers plan the focus of their testing resources; the answer to the second helps them choose appropriate test tools and methods. A few approaches to solving these problems have been fairly standard: safety testing, functional testing, error testing, free-form testing, and white-box testing.
Safety Testing. It is a given that a device must operate safely. Safety testing focuses test efforts on areas of potential harm to the patient or user. A thorough hazard analysis conducted during the design phase of product development will identify potential software safety hazards so they can be dealt with during product development.1 Close attention to safety testing becomes one way to verify that software-related hazards have been mitigated.4
Functional Testing. The device must perform according to its design; therefore, functional testing focuses test efforts on what the device is supposed to do. This type of testing is the heart of traditional validation and verification procedures. Functional testing usually receives the most attention and the largest number of test cases because each function must be subjected to exhaustive testing. Traditionally, this test goal is achieved by manually verifying as many combinations of parameter values and limit checking as is possible or practical.
To accomplish this, test cases are written to prove that every device function, as defined in a software requirements specification, does what it is supposed to do, reliably. For example, assume that the specification states that a certain entry field should only accept numbers from 1 to 999. A functional test plan might verify this by requiring the tester to enter each number and record the results. A good test plan will also direct the tester to enter various illegal combinations of letters, symbols, blanks, and negative numbers. If the device operates in different modes, the test plan may also require a set of tests for each mode. Notice that even in this simple case, the number of possible tests accumulates quickly.5
A more practical version of the test might not include all the numbers from 1 to 999, but it will include at least 0, 1, 999, 1000, illegal values, and tests for any known weak spots. Even a limited test plan should check the limits and boundaries of the field along with any values the tester thinks might cause trouble.
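A limited plan of this kind can be written down as a small table of boundary cases. The Python sketch below is illustrative only: `accept_entry` is a hypothetical stand-in for the device's entry-field logic, and the case list simply combines the limits named above with a few illegal inputs.

```python
def accept_entry(text: str) -> bool:
    """Hypothetical stand-in for the entry field: accept whole numbers 1..999 only."""
    # Only plain ASCII digits are legal; this rejects blanks, signs, letters, symbols.
    if not (text.isascii() and text.isdigit()):
        return False
    return 1 <= int(text) <= 999

# (input, expected acceptance): the limits, the values just outside them,
# and a handful of illegal entries.
BOUNDARY_CASES = [
    ("0", False), ("1", True), ("999", True), ("1000", False),
    ("-1", False), ("", False), (" ", False), ("12a", False), ("#", False),
]

def run_boundary_cases(field=accept_entry):
    """Return every case whose actual result differs from the expected one."""
    return [(text, expected) for text, expected in BOUNDARY_CASES
            if field(text) != expected]

failures = run_boundary_cases()
print("failures:", failures)
```

Nine cases stand in for the thousand-plus entries of the exhaustive plan. Values the tester suspects of causing trouble can be added to the table as they are found.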
Functional testing is an orderly process, but it is also very time-consuming and extremely boring. Boredom often results in human error in the tests and omissions in the documentation, regardless of how hard the testers try to be accurate.
Even if testing is conducted in a rigorous manner, the device itself may limit the scope and quality of manual functional testing. For example, the speed of the device and its peripherals may limit the number of tests that can be completed within the available time. Also, testers can type only so fast, and printers can produce only so many words per minute. The functioning of other physical inputs and outputs--probes, communications lines, tone generators, and other processors in the device--takes time, too.
In many cases, the tester may not even be able to get at important functions using manual methods. For example, a specification might state that a therapy should stop no more than 20 milliseconds after the command is given. This time frame is just too short to be verified by observation. Testing internal timings between processors or peripherals is even more difficult. There is no way a manual test can be used in such situations.
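A limit such as the 20-millisecond example becomes checkable once the command and the response are timestamped by instrumentation rather than observed by eye. The following Python sketch assumes a hypothetical event log of (timestamp in microseconds, event name) pairs. The event names and the sample data are invented for illustration and do not come from any real device.

```python
STOP_LIMIT_US = 20_000  # 20 milliseconds, expressed in microseconds

def stop_latencies(events):
    """Pair each stop command with the next therapy-stopped event.

    Returns the list of latencies in microseconds.
    """
    latencies = []
    pending = None  # timestamp of a stop command still waiting for its response
    for timestamp_us, name in events:
        if name == "stop_command" and pending is None:
            pending = timestamp_us
        elif name == "therapy_stopped" and pending is not None:
            latencies.append(timestamp_us - pending)
            pending = None
    return latencies

def late_stops(events, limit_us=STOP_LIMIT_US):
    """Return only the latencies that exceed the limit."""
    return [lat for lat in stop_latencies(events) if lat > limit_us]

# Made-up capture, for illustration only.
sample_log = [
    (1_000_000, "stop_command"), (1_012_000, "therapy_stopped"),
    (5_000_000, "stop_command"), (5_031_000, "therapy_stopped"),
]
print(stop_latencies(sample_log))  # [12000, 31000]
print(late_stops(sample_log))      # [31000]
```

This sketch does not flag a stop command that never receives a response. A fuller check would treat that case as a failure as well.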
Error Testing. The device must handle error conditions properly. Error testing ensures that the device works right when things go wrong--for example, power fluctuations or outages, internal parts failures, or nonsense values produced by peripheral devices. Tests for such errors are difficult to create and execute because it is so hard to recreate the errors in an orderly, repeatable fashion.
Most devices display messages telling the operator what's happening. However, there may be hundreds of error messages and thousands of situations requiring a response. Even if a tester has a list of all the messages, it is a daunting task to manually set up all possible situations to which a given message might apply. It is even harder to predict and test for cases in which no message is required, but one appears anyway.
As much or more time should be spent on error testing as on functional testing, but often this is not done. The time and expense involved in setting up the needed error conditions may be prohibitive. In addition, the person running the test may not have knowledge of the equipment's inner workings, which is necessary to interpret the results correctly.
Free-Form Testing. No formal test plan discovers every important software bug. Therefore, it is important to do free-form testing in addition to the formal, step-by-step approaches already described. Free-form testing subjects the device to additional stress and can even "idiotproof" it to a degree.
This type of testing defies formalization because it is not done according to product specifications. Rather, it is a process in which testers of different knowledge levels use the device in unexpected, unconventional ways in an attempt to provoke failures. For instance, the test staff might give the device to a group of naive users and watch them operate it until something breaks. (This has been called the "10,000 monkeys approach.") In another method, when knowledgeable users or test professionals discover a suspicious occurrence in a device's response, they subject that area to even greater stress. Intuition, hunches, persistence, and luck all play a part in this test process.
This type of testing is typically the most interesting for the testers. It also finds a large percentage of the bugs. Unfortunately, because it is so unstructured, there is no assurance that it will cover all potential problem areas. And after all the formal functional test plans have been executed (often several times), there is frequently no time left for free-form testing. A further problem is that, for the same reason, it may be difficult or impossible to recreate the circumstances in which bugs are found.
White-Box Testing. White-box testing methods are used when the tester must verify the internal workings of the device software in order to test it adequately. Most manual methods present the device to the tester as a "black box." Testers enter user-level inputs and evaluate the device's response. What occurs in between--inside the black box--is inferred from the results.
Such inferences can be mistaken, and therefore dangerous. Functionality that cannot be observed by the tester also merits testing. With white-box testing, the tester is able to look inside the device and create tests to find weaknesses in the program's internal logic. The target is no longer opaque; it has acquired windows into its functionality.
White-box testing enables a tester to examine communications protocols, evaluate data structures, analyze timings, and perform other tests that cannot be done any other way. White-box methods can be incorporated into manual safety testing, functional testing, error testing, and free-form testing. However, this is rarely done because of the following difficulties:
- The available tools are designed to work as debugging aids, not as functional testing aids.
- Hooking up logic analyzers, oscilloscopes, communications analyzers, ICE units, etc., presents logistical problems.
- Software test people are not usually familiar with hardware tools that would aid in the development of white-box tests.
Table I provides examples of device features tested by black-box and white-box approaches to testing.
Several other testing-related issues should be mentioned: test coverage, test documentation, and reproducible test results.
Test Coverage. The target software must have all of its functionality tested; therefore, the test process should include methods that ensure complete coverage of the code. Ensuring coverage at the functional test level is a different process from ensuring coverage in code modules. At the modular level, in-circuit emulators and debuggers can verify coverage. At the functional level, however, the testing process must attempt to ensure that all aspects of target software are tested together fully while the device is running in real time.
Test Documentation. The device is not considered to be correctly tested unless there is proof that it was tested. All test results must be written down to show that the tests were done, that the device performed acceptably, and that no part of the test process was skipped. Test documentation should be traceable back to the original requirements.6
Reproducible Test Results. The device is not considered to be correctly tested unless the results are reproducible.
AUTOMATED SOFTWARE TESTING
Automated software testing uses computer technology to test computer software; the goal of test automation is to make the test process faster, more accurate, and more complete, and to back it up with better documentation. In brief, automated software test systems use computer technology to stimulate the test target, monitor its response, record what happens, and control the entire process.
Stimulate the Target Device. Automated testing augments the human, environmental, and internal stimulation of the target with computer simulations. For example, instead of relying on an operator to produce all keyboard input, automated testing might create a keyboard simulator to send scan codes to the test target. It also might create other simulators for other inputs (trackballs, mice, touch screens, printers, communications ports, relays, and so on).
Monitor Device Response. Automated testing augments human test monitoring with computerized monitoring systems. For example, instead of relying on the operator to see and respond to every error message on the display, automated testing might create a screen-monitoring virtual device. Then, the program would instruct the test system to respond to the message by using the keyboard simulator to enter a range of correct or incorrect responses. It would do the same for any other output peripheral (printers, LEDs, warning tones, safety switches, and so on). It is also possible to monitor and respond to software internals such as memory structures, error counts, error status, timings, and interrupt behavior.
Record the Results. Automated testing replaces handwritten test journals with computer records. For example, instead of relying on the operator to jot down what's on a target display, automated testing might create a program to capture and log the display itself and automatically record the output of every other peripheral.
Control the Entire Process. Automated testing replaces manual test plans with computer-based test programs. Both manual and automated test programs are based on a requirements specification, and both contain a set of instructions spelling out how to test each part of the target device. The principal difference between them is that a good manual test plan works well only as long as the tester remains alert. By comparison, a good test program runs automatically and repeatedly, without fatigue or error.
Elementary automated test programs are very similar to the debugging aids used by developers. More-complex test systems, however, may direct whole sets of simulation, monitoring, and recording functions.
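The four roles described above (stimulate, monitor, record, control) can be seen together in a very small test program. The Python sketch below is a toy under stated assumptions: `SimulatedTarget` is a hypothetical stand-in for a device with a numeric entry field, and its display strings, like the step identifiers, are invented for the example.

```python
import json

class SimulatedTarget:
    """Hypothetical stand-in for a device: an entry field accepting 1..999."""
    def __init__(self):
        self.display = "READY"

    def send_keys(self, keys: str) -> None:
        # Input arrives from the keyboard simulator instead of an operator.
        if keys.isascii() and keys.isdigit() and 1 <= int(keys) <= 999:
            self.display = f"VALUE {int(keys)}"
        else:
            self.display = "ERROR: OUT OF RANGE"

    def read_display(self) -> str:
        # Output is read by the screen monitor instead of an operator.
        return self.display

def run_test_program(target, steps):
    """Control loop: stimulate, monitor, compare, and record every step."""
    log = []
    for step_id, keys, expected in steps:
        target.send_keys(keys)             # stimulate
        observed = target.read_display()   # monitor
        log.append({"step": step_id, "input": keys, "expected": expected,
                    "observed": observed, "passed": observed == expected})
    return log                             # record

STEPS = [
    ("T1", "1", "VALUE 1"),
    ("T2", "999", "VALUE 999"),
    ("T3", "1000", "ERROR: OUT OF RANGE"),
    ("T4", "abc", "ERROR: OUT OF RANGE"),
]
results = run_test_program(SimulatedTarget(), STEPS)
print(json.dumps(results, indent=2))
```

On a real test station the two methods of the target would drive hardware simulators and capture devices. The control loop and the log format would stay much the same.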
APPLYING AUTOMATED TESTING
Manufacturers of devices with multiple processors and extensive software-controlled hardware face a formidable test challenge. Given infinite staffing, time, and target device availability, exhaustive and complete testing procedures could be developed manually using most of the testing approaches described above. However, because all software development is driven by time and budget considerations, other ways must be found to test increasingly complex devices in a manner that is safe, efficient, and complete. This is the goal of automated software testing.
Let's consider each of the test techniques and issues described above with a view toward automation.
Safety Testing. Because it is difficult and time-consuming to simulate unsafe test conditions manually, safety testing is not always completed with the same rigor and attention to detail as other types of testing. A properly built automated test system can reduce the development and execution time usually associated with safety tests. Automated simulations of hazardous conditions accelerate testing and enable multiple repetitions of test cases.
Functional Testing. The functional test is the most obvious place to begin the process of test automation. It is generally the best-defined, most time-consuming, and most tedious test procedure. Often, it is difficult for developers to get off the treadmill of running manual tests long enough to create automated tests. However, it is crucial for management to understand that attempting to automate testing only in whatever spare time is left between manual tests is doomed to failure. On the contrary, automated testing requires a commitment of time, personnel, and equipment. Once automated techniques have been implemented, the number of people and amount of time required for functional testing decrease significantly.
Figure 1 shows results compiled by one device manufacturer as it moved from a manual to an automated testing system. During the transition, the company ran manual and automated tests side by side and kept track of the results. The figure shows the savings realized every time the company ran through the test cycle. Since regression testing often requires many repetitions of a test cycle, the advantages of automation can show up quickly here.
Error Testing. The second step in automating a testing system is usually the automation of error testing. Often, error testing is already partially covered by automating functional tests, but it still requires an operator to create errors on the equipment. While a certain amount of error testing should always be done manually, developers should consider the automatic simulation of error conditions.
An examination of the hazard analysis of a device will reveal that hazards often focus on errors. It follows that error testing should receive a disproportionate amount of the testing budget. Automating the error-generation process can actually be more beneficial than automating functional testing. First, it dramatically increases the number of error combinations that can be tested. Second, it allows tighter control over the error-production process. Testers can force errors to occur at several critical junctures while the software is running. Third, the documentation provided by a properly configured automated test system can help establish that software hazards have been dealt with properly.
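Forcing an error at each of several junctures can be sketched with a simulated peripheral that misbehaves on command. In the Python example below, `FaultySensor`, `control_step`, the temperature-like readings, and the limits are all hypothetical. They stand in for whatever peripheral and device logic are actually under test.

```python
class FaultySensor:
    """Hypothetical sensor simulator that returns a fault value on chosen reads."""
    def __init__(self, readings, fault_at, fault_value=None):
        self.readings = list(readings)
        self.fault_at = set(fault_at)   # read indices at which to inject the fault
        self.fault_value = fault_value  # None models a missing or nonsense reading
        self.count = 0

    def read(self):
        index = self.count
        self.count += 1
        if index in self.fault_at:
            return self.fault_value
        return self.readings[index % len(self.readings)]

def control_step(sensor, low=30.0, high=45.0):
    """Hypothetical logic under test: enter a safe state on nonsense input."""
    value = sensor.read()
    if value is None or not (low <= value <= high):
        return "SAFE_STATE"
    return "RUNNING"

def inject_at_each_point(readings, n_steps):
    """Repeat the same scenario once per injection point and record the states."""
    report = {}
    for fault_index in range(n_steps):
        sensor = FaultySensor(readings, fault_at=[fault_index])
        report[fault_index] = [control_step(sensor) for _ in range(n_steps)]
    return report

report = inject_at_each_point([37.0, 37.2, 36.9], n_steps=3)
for fault_index, states in report.items():
    print(fault_index, states)
```

Each run differs only in where the fault is injected, so the same report can serve as evidence that the error was handled at every juncture tried.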
Free-Form Testing. Free-form testing can also be improved by automation, although in a different manner. When automating the other forms of testing, the computer is doing what computers do best: countless repetitions of a known plan, relentlessly documenting everything correctly and completely. In free-form testing, the human is doing what humans do best: attacking a problem in a creative way, possibly never approaching it the same way twice, using experience and intuition to follow suspicious occurrences.
A computer is capable of creating random inputs and subjecting a device to stress in this manner, but it doesn't do it efficiently or well. Free-form testing should remain largely in the hands of the human tester, and should never be ignored or shortchanged because other aspects of testing are being handled automatically.
While manual free-form testing is taking place, however, developers can hook up the automated test system to the device being tested, and automatically document the test. This offers several benefits:
- If a tester uncovers a problem that cannot be recreated, developers can follow the entire trail of inputs. This does not ensure repeatability, but the information can be invaluable to the developers for debugging purposes. Also, if a device's internal data structures are being monitored and recorded, the chances of finding the problem are increased.
- Free-form tests are hard to document properly. However, automated techniques provide documentation of free-form testing almost as a side benefit. The same trail of inputs used by the developer to track down bugs can serve as excellent test documentation, proving that the test was completed.
- Free-form tests that uncover problems can be formalized and added to the automated test plan for the next regression test, enabling testers to look in new directions for problem areas.
- Because of the nature of free-form testing, many areas of the device's functionality might be ignored. By recording the results and analyzing them, developers can note the areas that have been missed and target them for attention later.
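The trail of inputs mentioned above can be captured by placing a recorder between the tester and the device. The Python sketch below is a minimal illustration. `InputRecorder` and `replay` are hypothetical names, the key names are invented, and a plain list stands in for the device's input channel.

```python
import time

class InputRecorder:
    """Wraps a target's input call and keeps a timestamped trail of what was sent."""
    def __init__(self, send, clock=time.monotonic):
        self._send = send
        self._clock = clock
        self._start = clock()
        self.trail = []  # list of (seconds since recording began, input)

    def send(self, keys):
        self.trail.append((self._clock() - self._start, keys))
        return self._send(keys)

def replay(trail, send):
    """Re-send the recorded inputs in order (timing is ignored in this sketch)."""
    return [send(keys) for _, keys in trail]

# A plain list stands in for the device's input channel.
received = []
recorder = InputRecorder(received.append)
for keys in ["MENU", "7", "7", "ENTER"]:
    recorder.send(keys)

# Later, the same trail can be fed to a fresh target.
replayed = []
replay(recorder.trail, replayed.append)
print(replayed)
```

Because the sketch ignores timing on replay, it shows the documentation benefit rather than guaranteed repeatability, which matches the caution in the first point above.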
White-Box Testing. Automation of white-box testing can be extremely valuable in bringing the testing process closer to the design and coding phase of a project. Introducing an automated test system that enables an inside look at software makes it easier to carry out white-box test techniques. Tests can be devised to monitor the parts of the code the programmer knows are sensitive (e.g., overflow and near-overflow conditions, timing interactions, interrupt handling, error counts, and so forth).
These examples are a small but representative sampling of what can be done with an automated tool that enables developers to obtain a white-box view into the device under test.
Test Coverage. Test measurement, a complete software test discipline in itself, can direct testing efforts to those high-risk areas of the software that will provide maximum payback. Automated code coverage tools can identify areas of the code that have not yet been tested, providing developers with a heightened sense of security in the test's effectiveness. Test coverage that uses automated reporting provides the traceability and documentation required to ensure that all product features have been tested.
Test Documentation. Automation of test documentation provides traceability all the way from test results back to product definition. It ensures and proves that complete testing of the requirements has occurred, and it speeds up the audit process.
Since documentation takes place as the tests are being run, there is no chance that it will be overlooked. In manual testing, the tester may let test documentation slip with the intention of getting back to it later. Automated documentation will be complete, correct, and legible every time a test case is run.
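Traceability from test results back to requirements amounts to a cross-reference that an automated system can rebuild on every run. The Python sketch below uses invented requirement identifiers and result records. The data layout is an assumption for illustration, not a format prescribed by any agency or tool.

```python
# Hypothetical requirements and results, for illustration only.
REQUIREMENTS = {
    "REQ-001": "Entry field accepts 1 to 999",
    "REQ-002": "Out-of-range entry shows an error",
    "REQ-003": "Therapy stops within 20 ms of command",
}

TEST_RESULTS = [
    {"test": "T1", "covers": ["REQ-001"], "passed": True},
    {"test": "T3", "covers": ["REQ-001", "REQ-002"], "passed": True},
]

def traceability(requirements, results):
    """Map each requirement to the (test, passed) pairs that exercised it."""
    matrix = {req: [] for req in requirements}
    for result in results:
        for req in result["covers"]:
            matrix.setdefault(req, []).append((result["test"], result["passed"]))
    return matrix

def untested(matrix):
    """Requirements with no recorded test at all."""
    return sorted(req for req, tests in matrix.items() if not tests)

matrix = traceability(REQUIREMENTS, TEST_RESULTS)
print(matrix)
print("untested:", untested(matrix))
```

A report of this kind shows an auditor which tests touched each requirement. It also shows the test group which requirements nothing has touched yet.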
Reproducible Test Results. It is difficult, and sometimes impossible, for a human to reproduce test results that are dependent on a device's internally regulated responses. An automated system that provides an independent timing reference, as well as the ability to react to real-time events on the target device, provides the reproducibility required by regulatory agencies.
TEST AUTOMATION AND SOFTWARE DEVELOPMENT
When software development is viewed as a whole, testing makes up a very large percentage of the process. Historically, however, the designing, coding, and debugging phases have received most of the resources and recognition. The last few years have seen great improvements in CASE tools, language development, third-party library routines, operating systems, software analysis tools, debugging aids, and so on. Truly, software development has become very sophisticated.
Although there have been some modest improvements, for most companies software testing remains largely as it was 20 years ago. Test plans are created on the fly and people follow these plans manually, recording the results they verify visually. The tester's knowledge of the product is purely functional. He or she has little or no knowledge of the software internals or where the weak points might lie.
Automated testing offers a wide range of process improvements to the medical device development cycle, from the obvious (reducing the labor required to run functional testing) to the subtle (helping product developers take action to ensure testability of the code from the design phase onward).
Automated testing can be introduced at almost any stage in product development and yield excellent benefits. The long-term goal of automated testing, however, is for manufacturers to make a major process shift in the development cycle to embrace the automated test process. A sampling of process improvements that can go hand in hand with integrating test automation early in the development cycle is described below.
Design for Testability. Product development proceeds most smoothly when test engineers are involved in the process from the start. Test engineers can coordinate with programmers during the design phase by providing testability requirements. This is a good idea even when using manual testing. Designing testability into the code is a huge undertaking, but the following technical considerations illustrate the general idea. Developers need to ask questions such as:
- How will errors be presented to the operator of the device? Is there a clear, concise method the test and tester can depend on, or is the error reporting inconsistent? What kinds of displays will be used? If the device produces audible tones, will the error tones be uniquely identifiable by their frequency or duration?
- Are there areas in memory in which error or statistical information is clustered? Any internal reporting locations may be important. An automated test station can be a very powerful debugging tool for developers. Nonintrusive monitoring can help expose timing and performance problems earlier in the development cycle, and much more easily, than is possible with conventional software debugging tools.
- Can state variables be created so that the test engineer will know how to pace the test process? For example, if a subsystem reports when it is stabilized, the test can wait for this signal before proceeding. In an attempt to force errors, tests can also use this signal to force input before device stabilization.
- Can the software communicate the location and size of stack, heap, or other dynamic area? Tests should exercise a device up to and, if possible, beyond its limits. Knowledge of when these data areas come close to filling can be very valuable. Often, errors caused by dynamic storage overflows may not appear for seconds or minutes after the problem has occurred. By that time, they are very difficult to retrace or recreate.
- Have critical timing windows in the software been mapped out? In multiprocessor devices, the mechanics of function coordination are often very complicated. Tests can be set up to exercise high-risk areas far more exhaustively if those areas are known to test engineers. Automated tests can force event timings that may not be possible using manual test methods.
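The state-variable question above lends itself to a short illustration: if the device exposes a "stabilized" flag, the test can pace itself on that flag instead of on fixed delays. In the Python sketch below, `wait_for` and `SimulatedSubsystem` are hypothetical names, and the polling interval and timeouts are arbitrary values chosen for the example.

```python
import time

def wait_for(predicate, timeout_s, poll_s=0.001,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate until it is true or the timeout expires.

    Returns True if the condition was observed, False otherwise.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if predicate():
            return True
        sleep(poll_s)
    return predicate()  # one last look at the deadline

class SimulatedSubsystem:
    """Hypothetical subsystem that reports stable after a number of polls."""
    def __init__(self, polls_until_stable):
        self._remaining = polls_until_stable

    def is_stable(self):
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

subsystem = SimulatedSubsystem(polls_until_stable=3)
became_stable = wait_for(subsystem.is_stable, timeout_s=1.0)
never_stable = wait_for(lambda: False, timeout_s=0.01)
print(became_stable, never_stable)
```

The same flag serves the error-forcing case: a test that deliberately sends input while `is_stable()` is still false exercises the device before stabilization.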
Early Test-Case Development. Test engineers need time to create test methodologies. A good automated software test is created much the same way a good product is developed--with planning, partitioning of functionality, and adapting to changes when necessary. Test groups can spend their early time in the project creating reusable tests. This is also the time for determining general test philosophies, the most important of which include how to control the timing of testing and how to ensure the best test coverage for a particular application within the available time and resources.
These early test cases can often be tried on prototype software, even if hardware is not yet available. The engineering and test groups can work together to mock up devices and input-output areas so that testing can begin early.
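Mocking up an input that does not yet exist in hardware can be as simple as substituting a scripted function for the port read. The Python sketch below uses the standard library's `unittest.mock.Mock`. The `alarm_needed` function, the pressure values, and the limit are hypothetical prototype logic invented for the example.

```python
from unittest.mock import Mock

def alarm_needed(read_pressure, limit=300):
    """Hypothetical prototype logic: raise an alarm when pressure exceeds the limit."""
    return read_pressure() > limit

# The mock stands in for a hardware port that is not yet available.
# Each call returns the next scripted reading.
port = Mock(side_effect=[120, 305])

first = alarm_needed(port)    # scripted reading of 120 is below the limit
second = alarm_needed(port)   # scripted reading of 305 is above the limit
print(first, second, port.call_count)
```

When the real hardware arrives, the test keeps its shape. Only the function passed in as `read_pressure` changes.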
It is this phase of test development that often turns the heads of the development staff. Problems are detected very early, and test groups learn how to create testing routines that accommodate the changes made to a product during its design. This process saves a considerable amount of time later in the development cycle, when the time available to make changes to the testing process begins to diminish.