Developing medical devices in today's budget-constrained world requires the use of cost-effective methods. Compliance with the quality system regulation (QSR) requires the use of verification and validation techniques, with testing as a key activity. Even with the best specification and design processes, testing plays a critical role in attaining desired product quality. Given the typically high-effort allocations to test specification, execution, and documentation, development dollars can be saved by using new and better techniques. Among these is session-based testing.
Testing can include tasks ranging from those performed by a design engineer on the laboratory bench to verification testing of embedded software. It can also include design validation as well as critical agency testing to ensure safety and electromagnetic compliance. Focusing on the testing process for software-intensive devices, we have found that test effectiveness is often compromised by an overemphasis on one or more of the following factors.
Streamlining. Given tight budgets for design validation testing, test engineers are encouraged to design streamlined test protocols. Standard methods of requirements partitioning, leading to fewer and simpler protocols, often result in testing that is highly compartmentalized. This can cause engineers to miss defects whose discovery requires operation across several product features. Streamlined protocols can lead to test setups that are common across multiple protocols. The user-interaction variety that might be found in field operation is missed, along with the testing of potentially defective paths.
Documentation. Given the QSR-based mindset of rigorous documentation, test engineers spend a great deal of time on documentation “housekeeping.” Defects can be overlooked if testers use their budgeted time on meticulous documentation at the expense of time spent exercising the device through hands-on testing.
Requirements Coverage. Design validation testing is always focused on requirements coverage. Driven by the standard process requiring traceability between test protocols and the primary requirements, this late-stage testing is largely intended to demonstrate a fulfillment of requirements. While defects may be found, design validation testing does not emphasize or demand defect discovery.
Code Coverage. Code coverage—the goal of having the testing exercise a large portion of the software—is often emphasized as the critical metric in verification testing. Tackling coverage from a systems-level approach is expensive and tedious. As a result, test teams are driven to test at a software-module or unit level. Focusing on this level of granularity, a variety of overall system states and preconditions can be missed. If the emphasis is on achieving coverage, a test can cover all source code statements or all decisions, yet still miss significant interactions between various features and internal software states.
Repetitive Retesting. Retesting after code corrections or the issue of a new revision is often limited to previous test paths. If new features are added, the test entry paths used previously are often reused because they are well understood by the test engineers. Additionally, repeating a known user sequence is efficient in an operational sense. The drawback is that such retesting reduces the chance of finding problems—especially any new problems generated by the new features.
|Figure 1. Documentation overhead versus testing innovation.|
A Two-Pronged Approach. Although requirements fulfillment is a necessary part of testing, it is easy to overlook the equally important goal of finding defects. Defect discovery requires innovation that hammers away on functionality with the intent of breaking the code. Assuming a goal of releasing a product with few embedded defects, creative approaches to discover the most defects in the allocated time period—especially early in the development life cycle—are the most effective deployment of test resources. Test management in a rigidly structured, process-based environment fails to emphasize innovation in finding defects. Rather, emphasis is placed on getting items done according to standard operating procedures.
Development managers faced with the combined responsibility of complying with the QSR and having few device failures must consider a two-pronged approach that both addresses traditional requirements-based testing and explores instrument functionality.
Session-based testing represents a methodology that is functionality-driven rather than requirements-coverage-focused. Experience shows it can be used effectively in device development to find defects that can be missed by other test methods, even when those are rigorously applied. Session-based testing encourages creative thinking in “breaking” the device and, thus, uncovering defects before field deployment. As illustrated in Figure 1, when compared with familiar testing forms, it emphasizes innovation and reduces documentation overhead.
Session-based testing can be used in combination with standard methods as an additional defect filter tuned to explore functionality. It also can be used in situations outside the normal development pathway. Our experience shows it to be an effective approach in the device development setting because it has the following attributes:
• The method operates within a documented process structure.
• It supports work allocation management consistent with the concept of planning before execution.
• It emphasizes defect discovery over coverage and supports following intuitive thinking on what might lead to defect discovery.
• It sanctions diversity in the test team. Trained test engineers are important, but others can be trained quickly to be effective testers.
• It includes documentation of defects, metrics that support the testing effort, and features related to the medical system covered by testing.
• It can find defects that traditional methods may miss, such as system stress situations and subsystem-to-subsystem interactions.
• It produces records that complement but do not replace verification documentation.
The following overview describes the basic methodology and identifies opportunities where it might be used. This article also describes how the methodology is implemented, with notes derived from actual experience.
|Figure 2. Framework for a session-based approach to device testing.|
Understanding Session-Based Testing
Jonathan Bach originally described session-based testing methodology and its use with application software.1 Bach's approach was linked to exploratory testing during which testers are encouraged to try anything that comes to mind. Exploratory testing can be effective in finding defects. The method's unstructured approach, however, can result in wasted effort if performed by a team, because several testers may unknowingly explore the same area. Bach extended the utility of this testing by defining an interactive structure and supporting templates that ensure test deployment is carefully managed and that defects are captured.
Figure 2 illustrates the framework for a session-based approach to testing a device. Key points to be addressed include the following:
Test Mission Plan. Because session-based testing can be applied at different points in the development life cycle, high-level planning ensures that the goals of the testing at a given point are well stated. Early planning establishes the intent of the testing effort with specific details provided on the configuration to be tested, the management of defects, and the allocation of test resources. This work typically provides a test mission plan that acts as a contract between the test manager and the organization. Depending on the situation, this work may be necessary to obtain funding and resource allocation.
|Table I. Session-based application opportunities.|
Ongoing Planning. Effort assessment and the allocation of the effort into different test sessions must be done initially, and should be updated throughout the mission. The planning work entails issuing charters to individual testers describing the goals to be accomplished during a test session. Because testers are permitted to explore opportunities or tangents and may not complete a charter's goals during the time allocated, the test manager operates with some flexibility to ensure all originally planned goals are accomplished. Further, opportunity testing results and reports of possible opportunities may result in subsequent charters being issued for other sessions.
Tester Resources. Testers must be equipped to execute a charter during a test session. This is generally achieved through study supported by training on device operation; review of requirements documentation; review of field-bound information, including user's manuals, in-service training, and labeling. It is also helpful to have access to a domain expert who is familiar with the clinical environment, intended device use, and typical modes of failure for similar devices. The domain expert acts as a consultant to all members of the team, including those who define the charters and evaluate unanticipated device behavior.
Because session-based testing draws on a diverse group of testers, the individuals involved may bring totally different, and useful, approaches to the device compared with test engineers who were exposed to it during the early development work.
Test Sessions. Test sessions are specific time periods set aside for testing consistent with a charter. They require access to the medical device and other accessories that might be needed for the unit to function. Testers may make notes, but do not focus on the rigorous logging that might be found with design validation testing. A test session might be no more than an hour and a half.
|Figure 3. Example of a functional breakdown for an infusion pump.|
Debriefing. Each test session is followed by a debriefing session involving the tester and the test manager. This session allows the manager to be made aware of several things:
• Anomalies found by the testers, including details on the steps leading to failure.
• Device operation issues that suggest points to improve but that do not represent incorrect behavior. For example, a tester might indicate that accomplishing a task was quite awkward.
• Charter-fulfillment feedback describing how much of the scope was fulfilled.
• Opportunity testing results that suggest the need for additional charters.
• Possible opportunities for paths that were not explored but are perhaps important and might require additional charters.
The debriefing should occur immediately following the test session. Given the typical scheduling concerns that can arise when several sessions run concurrently, the tester and the manager work to ensure that the debriefing occurs the same day as the session. While a debriefing session can provide details of anomalies in device function, it can also help identify errors in supporting material, such as user guides and labels.
Summary Report. The test sessions continue until all charters are completed or until the contracted time is exhausted. A report is then produced to summarize critical details. The report should include coverage of the intended mission, anomalies and defects found during the testing, opportunities identified but not explored by the testers, and metrics such as the time spent in each of the device's functional areas.
Comparison with Usability Testing. Session-based testing is somewhat similar to usability testing, which uses methods that are intended to evaluate the utility of the user interface.2 The two testing forms are similar in that the methodologies rely on user documentation, are often broken into separate small test periods, and can use the talents of a diverse test group. They diverge, however, in that the goal of usability testing is to evaluate the utility, consistency, understandability, and error proneness of the user interface, whereas session-based testing is more global in seeking to uncover defects in all aspects of device operation.
Test activities vary in effectiveness for detecting device defects. Considering the overall success of uncovering defects within a testing stage, the best approaches reach effectiveness levels of only about 40%. Although the authors are not aware of any academic test-effectiveness trials that apply the exact methodology reported here, experience with both usability testing and session-based testing suggests that the metrics for session-based testing should resemble those for usability testing. Considering this, a review of published metrics supports the authors' opinion that the effectiveness of session-based testing is greater than that for either unit testing or design validation testing.3 Strictly considering the reported metrics, it appears to be only slightly less effective than beta testing.
Opportunities for Applying Session-Based Testing
Session-based methodology can be applied at different points within the product development cycle. Several situations are described in Table I. As the following section on implementation illustrates, certain aspects of this testing approach make it attractive to senior management. Among these are:
• The method is simple to understand and to present.
• It is being used today by other device manufacturers.
• Its procedures entail low start-up overhead.
• It can be used by testers with a broad range of experience.
• It provides demonstrated utility in identifying defects other approaches miss.
• It offers wide applicability, as illustrated in Table I.
Getting Started with Session-Based Testing
|Table II. Interactions during session-based testing.|
Implementing session-based testing begins with a test manager, a test plan, and a test team. The test manager writes a plan describing the functional areas to be covered, starting at a high level and then digging deeper. Inputs to the planning include documents describing device operation (such as a user's manual), functional specifications, and the device itself. Unlike formal validation planning, there is no attempt at formal traceability.
It should be emphasized that the goal is to maximize time spent testing. While the test manager is busy with the top-level plan, members of the test team can familiarize themselves with the device, acquire necessary hardware, set up a defect-tracking system, and produce a test log template, among other related activities. The result of the planning process is a functional breakdown of the device. For example, a generic infusion pump might include some of the functions identified in Figure 3.
As soon as a plan has been developed, the test manager begins writing high-level charters for sessions to test the functional areas. Each charter is a statement of intent—a mission statement for a test session. To help generate meaningful charters, the test manager may enlist the help of a domain expert who can explain the users' goals related to the device and how users expect the device to behave. Charters typically are geared to test sessions of approximately 90 minutes, though session lengths may range from about 45 minutes to 2 hours. An example charter for the infusion pump might be:
This session will take a close look at the max. bolus limit, and check its effect on boluses of various types (quick, extended, etc.). Various settings, including extents of ranges, will be checked.
Each session is assigned a predefined functional area. This categorization is necessary to track the amount of testing devoted to each functional area. Although categories were identified in the initial test plan, they may evolve as testing proceeds and the focus moves to more-detailed areas. Likewise, the charters for sessions may migrate from the general to the specific as testing proceeds. Initial charters focus on exploring the higher-level functionality of the device. As areas of interest are uncovered, the charters usually become more specific.
|Figure 4. Example of an infusion pump test session log.|
Prior to starting a session, the test manager gives the tester the charter for the session. Beginning with the test charter, the tester uses intuition, creativity, and training in attempting to break the device software. The tester is generally expected to stay within the charter, but has the freedom to pursue an interesting anomaly or unexplored area that might be uncovered during testing. Time spent deviating from the charter is called opportunity time and is fully encouraged. If more of the session is devoted to opportunity time than charter time, the session's charter may be changed to reflect this. The original charter is retained for a future session.
The tester documents progress through the session by recording details of the functional areas tested, specific tests conducted, and anomalies observed. In running a test, the tester may encounter some issues or problems. The tester logs any such issues or questions raised during the test, as well as any ideas or suggestions for future sessions. In addition, the tester records the distribution of the test session time from two perspectives. The first distribution shows the percentage of time spent on test design and execution, defect exploration, and setup. The other distribution distinguishes between charter time and opportunity time.
These metrics provide a window into how effectively testing time is being utilized. The log generated by this process is superior to those that usually result from ad hoc testing because it is immediately reviewed for content and consistency. In addition, the log is kept in a centrally accessible location. Figure 4 provides a sample of how this information might be captured in the infusion pump example.
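As a rough sketch, the log contents and the two time distributions described above can be captured in a simple record. The field names and the validation rule here are hypothetical illustrations, not part of any standard session-log template:

```python
from dataclasses import dataclass, field

@dataclass
class SessionLog:
    """Minimal sketch of a session-based test log (hypothetical fields)."""
    charter: str
    functional_area: str
    anomalies: list = field(default_factory=list)
    # First distribution: percentage of session time by activity.
    pct_design_execution: int = 0
    pct_defect_exploration: int = 0
    pct_setup: int = 0
    # Second distribution: charter time versus opportunity time.
    pct_charter: int = 100
    pct_opportunity: int = 0

    def validate(self):
        # Each distribution should account for the whole session.
        assert self.pct_design_execution + self.pct_defect_exploration + self.pct_setup == 100
        assert self.pct_charter + self.pct_opportunity == 100

log = SessionLog(
    charter="Check max bolus limit against bolus types",
    functional_area="Bolus delivery",
    anomalies=["Extended bolus ignores limit when edited mid-delivery"],
    pct_design_execution=70, pct_defect_exploration=20, pct_setup=10,
    pct_charter=80, pct_opportunity=20,
)
log.validate()
```

Keeping the distributions as structured fields, rather than free text, is what makes the immediate review for consistency and the later metrics roll-ups cheap to perform.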
The test manager debriefs the tester after each session. The purpose of this debriefing is to ensure that the test manager understands the tests performed and approves the results and observed anomalies. The test manager requests that the tester clarify the test results as necessary, and enter the defects into the tracking system. In addition to standardizing the reports across the test team, and ensuring that the desired test coverage is achieved, this process is also critical in helping the test manager plan further testing.
As testing progresses, the test manager uses the metrics collected to help measure progress against the plan, the amount of effort being applied to the various device features, the number of defects being reported, etc. This information is needed when answering questions from management, such as “How much testing is left?” and “How many more defects will we find?”
How Much Testing Is Left? Once testing has begun, the amount remaining may be estimated by multiplying the average session time by the expected number of sessions remaining. The number of sessions remaining may in turn be estimated by looking at the number of charters remaining and factoring in the time spent lately on opportunity testing. (More opportunity time suggests more undocumented charters remain. Less opportunity time suggests fewer undocumented charters remain.)
How Many More Defects Will We Find? As testing progresses, the test manager can more accurately estimate the total number of defects expected. A first-order approximation is to multiply the number of defects per session by the number of remaining sessions. In reality, the defect discovery process is not linear with time; it tends to be faster at first and slower at the end, as the focus of the sessions becomes more specific. Thus, the approximation may be reduced if recent defect-discovery rates show a downward trend.
The test effort yields a summary report, a set of defect reports, and a set of session reports. In addition, the metrics collected provide important insight into the quality of the testing performed and of the product itself. The test manager may summarize metrics of interest, such as:
• Time spent on each functional area.
• Time spent on charter versus opportunity versus defect characterization.
• Session completion rate over time.
• Defects per functional area.
• Defects discovered over time.
• Effort per defect.
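If session records are kept in machine-readable form, several of these metrics fall out of a simple roll-up. The record keys and example values below are hypothetical:

```python
from collections import defaultdict

def summarize(sessions):
    """Roll up session-level metrics: time and defects per functional
    area, and overall effort per defect. `sessions` is a list of dicts
    with hypothetical keys: "area", "hours", "defects"."""
    time_by_area = defaultdict(float)
    defects_by_area = defaultdict(int)
    total_defects = 0
    total_hours = 0.0
    for s in sessions:
        time_by_area[s["area"]] += s["hours"]
        defects_by_area[s["area"]] += len(s["defects"])
        total_defects += len(s["defects"])
        total_hours += s["hours"]
    effort_per_defect = total_hours / total_defects if total_defects else None
    return {"time_by_area": dict(time_by_area),
            "defects_by_area": dict(defects_by_area),
            "effort_per_defect": effort_per_defect}

sessions = [
    {"area": "Bolus delivery", "hours": 1.5, "defects": ["D-101"]},
    {"area": "Alarms", "hours": 2.0, "defects": ["D-102", "D-103"]},
    {"area": "Bolus delivery", "hours": 1.0, "defects": []},
]
print(summarize(sessions))
```

Trends over time (session completion rate, defects discovered over time) follow the same pattern with a date key added to each record.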
The quality of the testing performed and the documentation collected will depend on the skill of the individuals doing the tests. Session-based testing provides the structure for organization and management, but it still relies heavily on the contributions of the team members. For that reason, care must be taken in choosing the test manager and the testers.
The test manager generates the charters, assigns them to testers, and debriefs the testers after each session. In addition, the test manager decides which observed anomalies qualify as reportable defects, suggests tests the testers might try, calibrates the testers to the process expectations, and reports results to management. This role requires a person who can take a “big picture” view of the product, combine that with an effective management process, and mentor a team to produce consistent, usable output.
For their part, the testers must be comfortable acting on their own initiative—outside of a script—to exercise the features of the product in a demanding way. At the same time, they must be open to direction and correction from the test manager. They have to be creative in generating test cases on the fly. They must also be able to effectively summarize what they've done. Experience in software testing is helpful, but training can substitute to some extent. The test team should be composed of people who are observant, focused, and skeptical about software working properly.
An Interactive Process
Session-based testing is inherently interactive. The testers interact with the device to find out how it works, and the test manager and testers discuss the results in debriefing sessions. Even more interactions become apparent when management of the overall test mission is considered. At the start, someone proposes that using session-based testing is a good idea for a given application. That idea is formalized into some kind of proposal that indicates scope, deliverables, and the expected cost and time frame. Once the mission sponsor approves the proposal, the test manager expands the proposal into a test plan and writes the initial charters. Testing commences, and the plan and charters are updated as needed. Along the way, the mission manager holds regular meetings, which provide a forum for discussing team-wide issues and progress. Finally, when testing is completed, the test manager and mission manager write a summary report and provide it to the mission sponsor. Table II summarizes these interactions.
Experience with session-based testing provides additional lessons that can prove to be useful.
• Session-based testing is especially effective if the testers are independent from the ongoing development and verification and validation effort. Fresh eyes see more problems.
• During the planning stage, functionality should be broken down to a fine grain. This will provide better estimates and better team coordination. On the other hand, forming the team should not be delayed until every “i” is dotted on the test plan. The test manager should write a few top-level charters and then flesh out the functionality description while the testers get started.
• Debriefing sessions should start with one debriefing per test session. For simple applications or when experienced testers are involved, however, the debriefing frequency may be halved after the process is established.
• Inherently formal test protocols, such as those used to measure volumetric output of the infusion pump from the example above, are best left to formal methods. Effective session-based testing lends itself to areas with heavy user interaction and outcomes that can be confirmed quickly.
• A session-based test management tool, available as a free download from Satisfice (www.satisfice.com/sbtm/), has been found to be useful in automating metrics reporting.
• Test managers can make up blank session logs and fill them in with a charter. That way, they are ready for the tester to use.
• Beyond the debriefing sessions, it is helpful to have daily team meetings, especially for the first few days. At these meetings, the test manager or testers can raise any issues that might affect everyone. This is a good way to iron out the process for each person trying to use it and to share knowledge among the testers about the test device or system.
Estimating the Effort
To generate an initial estimate of the time needed, the test manager should analyze the functional breakdown of the device, paying close attention to the complexity of the various functional areas. The test manager uses that information to roughly calculate the number of sessions that will be necessary to cover all aspects of functionality. The test manager should then add 50–100% more sessions, to represent opportunity testing and discovered new charters. Testers will be able to complete only one or two sessions per day at first, but should be able to increase this number to three per day after a week or so. The test manager must allow for about 10 minutes of debriefing per test session. This limits the number of testers reporting to the test manager to approximately 10 to 15, but the team size may tolerate some growth later. Of course, the estimates should be refined as the project progresses.
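The estimation arithmetic above can be sketched as follows. The functional-area breakdown and the parameter values are hypothetical; the 50–100% inflation and the sessions-per-day figures come from the guidance in this section:

```python
import math

def estimate_sessions(base_sessions_by_area, opportunity_factor=0.75):
    """Initial mission estimate: sessions needed to cover each functional
    area, inflated 50-100% (opportunity_factor 0.5 to 1.0) to allow for
    opportunity testing and newly discovered charters."""
    base = sum(base_sessions_by_area.values())
    return base * (1 + opportunity_factor)

def calendar_days(total_sessions, testers, sessions_per_tester_per_day=2):
    """Rough schedule: testers complete one or two sessions per day at
    first, increasing to about three after the first week."""
    return math.ceil(total_sessions / (testers * sessions_per_tester_per_day))

# Hypothetical infusion-pump breakdown: sessions per functional area.
areas = {"Programming": 6, "Bolus delivery": 8, "Alarms": 10, "History log": 4}
total = estimate_sessions(areas)        # 28 base sessions * 1.75 = 49.0
print(total, calendar_days(total, testers=4))
```

As the section notes, these numbers are only a starting point and should be refined against actual session completion rates as the project progresses.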
To some extent, the work done may be scaled to fit the available time and resources. For example, the test manager could break the functions down into categories according to safety risk, importance to the product, or new or repaired features. Then, the most important functions may be tested first, and testing of lesser functions can be omitted. That way, management has the opportunity to balance the remaining test effort against the remaining project risk.
Experience suggests that session-based testing is a solid methodology that can be easily targeted to the QSR-compliant environment. Used with the common techniques associated with verification and validation, it permits a two-pronged approach that emphasizes both requirements- and functionality-driven testing. Further, session-based testing provides a cost-effective means of identifying defects. The method is simple to understand, easy to implement, and has demonstrated effectiveness in ensuring the viability of medical device software.
1. J Bach, “Session-Based Test Management,” Software Testing and Quality Engineering (November/December 2000).
2. C Engelke and D Olivier, “Putting Human Factors Engineering into Practice,” Medical Device and Diagnostic Industry 24, no. 7 (2002): 60.
3. TC Jones, Estimating Software Costs (New York: McGraw-Hill, 1998), 554.
Copyright ©2003 Medical Device & Diagnostic Industry