Originally Published MDDI November 2003
Product Development Insight
Talking medical devices can enhance the way users interact with such products. But how can you ensure that your device is actually helpful?
Michael E. Wiklund
American Institutes for Research
How do you feel about products that talk to you? Do you appreciate automobile navigation systems that direct you to “Turn left ahead”? Or an automated receptionist that invites you to “Listen carefully to the menu options because some have been changed”? And what about talking bottle openers?
If you're like most people, you find some of these things helpful and others annoying. What seems to be the distinction? Whether the voice is a useful aid or just a noisy gimmick. This simple criterion can be applied to talking medical devices as well.
So far, medical device manufacturers seem to have taken a smart, disciplined approach to giving their products a voice. There are some excellent examples of talking devices that enhance the way users interact with them. Voice-enabled medical devices are leading users through some particularly challenging tasks, breaking down barriers to independent use by people with impaired vision, and improving usability as a whole.
Automated external defibrillators (AEDs) are an impressive example of voice prompts enhancing user interactions with a medical device. An AED is designed for use in an emergency when a victim may require resuscitation. The latest-generation AEDs use voice prompts to guide users through the numerous steps required to perform a successful rescue. Spoken instructions, such as “stay calm” and “check breathing,” spare the rescuer from having to read textual or graphical instructions, which would be more time-consuming and could create confusion. Clearly, the use of voice prompts in an AED is no gimmick. People who have used devices such as the Zoll AEDPlus (see Figure 1) respond well to the voice prompts, finding them both helpful and reassuring during a stressful event.
Talking glucose meters are another exemplary application of speech technology, but they focus on a different goal. The value of a talking glucose meter, such as the Roche Diagnostics Accu-Chek Voicemate (see Figure 2), is that diabetics who have vision impairments or total blindness (common outcomes of the disease) can use it independently. A visually impaired person simply needs to follow the device's spoken instructions to apply blood to a specially designed test strip, insert [the] strip into the device, and listen for the numerical result. This natural means of interaction could also serve the needs of people who have cognitive impairments. Or it might be preferred by people who simply would rather listen to information such as “result is 64 milligrams per deciliter” than read it on a small display. Of course, a user could choose between the two modalities if the device incorporated both a display and voice output.
About the Technology
As in designing other types of user interfaces, there is both an art and a science to producing a well-spoken medical device. Designing a good one is a matter of balancing technical and user-centered needs.
Choosing the right technology is one of the more straightforward design tasks. For medical devices that only talk (i.e., that do not have voice recognition capabilities), developers can choose between digitized and synthesized speech technologies. Digitized speech usually sounds much better because it is based on recording a real person's voice, then playing back the right segment at the right moment. The technology works best with medical devices requiring a relatively small vocabulary. According to Ward Hamilton, vice president of marketing at Zoll Medical (Burlington, MA), digitized speech is well suited to the firm's defibrillator. The device is intended for use by laypeople who have received training on the fundamentals of resuscitation as well as on use of the company's AED, as recommended by the American Heart Association.
Once Zoll engineers determined the specific voice segments needed to guide a user through the numerous steps of assessing a victim's status and delivering a shock and/or CPR, the remaining task was to choose the right voice talent to record the messages. The designers ultimately chose a man with an authoritative-sounding, medium-pitched voice to create the recordings; his résumé included narrating for Nova, the science-oriented public broadcasting program. The software installed in Zoll Medical's computer-driven defibrillator holds the voice segments in memory and plays them back according to a rigid protocol.
“We deliberately made the prompts terse based on our general understanding that lots of verbiage is hard to deal with in an emergency,” Hamilton says. “People who use our device have stepped into a situation where someone appears to have died. It's very stressful. We don't want to give them too much to remember . . . [or they] become numb to it.” Developing effective prompts “is a tremendous balancing act that requires lots of customer feedback,” Hamilton adds. “In our case, we need to provide enough information so that users recollect their basic life support skills and CPR training.” Hamilton estimates that they received feedback from more than 2000 people over several years of development.
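The playback scheme described above, in which stored segments are played in a fixed order, can be sketched in a few lines of Python. This is a minimal illustration only, not Zoll's software: the segment names, the dictionary layout, and the `run_protocol` function are hypothetical, and the prompt wording is borrowed from the examples quoted in this article.

```python
from typing import Callable, List

# Hypothetical segment IDs mapped to the text of each recorded clip.
# On a real device each entry would point to stored audio, not a string.
SEGMENTS = {
    "stay_calm": "Stay calm.",
    "check_breathing": "Check breathing.",
    "dont_touch": "Don't touch the patient.",
}

# An illustrative fixed order of playback (the "rigid protocol").
PROTOCOL = ["stay_calm", "check_breathing", "dont_touch"]


def run_protocol(play: Callable[[str], None]) -> List[str]:
    """Play every segment in protocol order and return the IDs played."""
    played = []
    for segment_id in PROTOCOL:
        play(SEGMENTS[segment_id])  # stand-in for starting audio playback
        played.append(segment_id)
    return played
```

Calling `run_protocol(print)` simply prints the three prompts in order. The point of the sketch is that the vocabulary is small and fixed, which is what makes digitized speech practical here.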
Synthesized speech differs sharply from digitized speech in terms of both its sound quality and its application. A synthesized speech segment sounds exactly the way the term suggests: synthetic. A computer, rather than a human, is doing the talking, and you can tell. Often, synthesized speech sounds nasal, emotionless, and strangely accented. Pitch variations, or the absence of them, can also make synthesized speech seem unnatural and hard to decipher.
In the typical application, a software program generates a phrase or sentence to be spoken by the computer. Next, the computer strings together the sounds associated with letter combinations in the words. Over the past decade, voice synthesis technology has improved considerably. Artificially generated speech now sounds more human, and has become more intelligible.
|Figure 1. The Zoll AEDPlus uses voice prompts to guide users through the steps required for a rescue.|
Nancy Lonsinger is vice president of marketing at Roche Diagnostics (Indianapolis). She says that synthesized speech was the best solution for the Accu-Chek Voicemate blood glucose meter because of the variability of the device's spoken output. For example, the Voicemate speaks the value of the glucose measurement and the identity of insulin vials. It reports back to the user his or her blood glucose level, which can range widely in value, from 10 to 600 mg/dl. Synthesized speech clearly makes sense in this case. It would be impractical to record all of the possible combinations of words and values, some of which would be unknown at the time of device production.
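To make the idea of variable output concrete, the sketch below builds the phrase that software might hand to a speech synthesizer for any reading in the 10 to 600 mg/dl range. It is an assumption-laden illustration, not Roche's implementation: the function names and the exact wording of the phrase are hypothetical, and the synthesizer itself is not shown.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out a whole number from 0 to 999 in English words."""
    if not 0 <= n <= 999:
        raise ValueError("only 0 to 999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {number_to_words(rest)}"


def glucose_phrase(mg_dl: int) -> str:
    """Build the text to be spoken for one glucose reading."""
    if not 10 <= mg_dl <= 600:
        raise ValueError("reading outside the 10 to 600 mg/dl range")
    return f"Result is {number_to_words(mg_dl)} milligrams per deciliter"
```

For a reading of 64, `glucose_phrase(64)` produces "Result is sixty four milligrams per deciliter". The phrase is generated at run time, so no recording of each possible value is needed.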
“Before the Voicemate, visually impaired patients would have to wait for assistance from a family member or other helper to test their blood,” Lonsinger says. “Now, they can perform the task by themselves.” She describes the Voicemate as an important addition to the company's line of blood glucose testing devices, addressing the needs of a special market segment.
The device is priced in the range of $400–$500, making it 5 to 10 times more expensive than standard blood glucose meters aimed at the general consumer market. The product's high price is reportedly due to its high development cost and relatively low sales volume. “Voicemate has been wildly successful within its relatively small market,” Lonsinger notes, “but it hasn't generated substantial financial rewards for the company.” Still, she says, “Voicemate has generated considerable goodwill toward the company. It also gives our employees a deep sense of satisfaction because we are helping people take care of themselves.”
A hybrid technical approach to producing medical devices that talk involves building whole sentences from individually digitized words. In such applications, the sentence “Your temperature is ninety eight point six degrees Fahrenheit” is produced by stringing together the words Your, temperature, is, ninety, eight, point, six, degrees, and Fahrenheit. One can add inflection to specific words to make the final sentence sound more natural. Accordingly, a device's vocabulary may include multiple versions of the same word, each one spoken with a different inflection to sound, for example, inquisitive versus directive.
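The hybrid approach can be sketched as two steps: break the sentence into words, then pick a recorded clip for each word. The code below is a rough illustration under stated assumptions. The clip file names, the two inflection labels ("neutral" and "final"), and the rule that only the last word gets a sentence-ending inflection are all hypothetical.

```python
from typing import List

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def whole_words(n: int) -> List[str]:
    """Return the individual words for a whole number from 0 to 199."""
    if not 0 <= n <= 199:
        raise ValueError("only 0 to 199 is supported in this sketch")
    if n >= 100:
        rest = n - 100
        return ["one", "hundred"] + (whole_words(rest) if rest else [])
    if n < 20:
        return [ONES[n]]
    tens, ones = divmod(n, 10)
    return [TENS[tens]] + ([ONES[ones]] if ones else [])


def temperature_words(temp_f: float) -> List[str]:
    """Split a temperature (one decimal place) into the words to be spoken."""
    tenths = round(temp_f * 10)          # 98.6 becomes 986
    whole, frac = divmod(tenths, 10)     # 986 becomes (98, 6)
    return ["your", "temperature", "is", *whole_words(whole),
            "point", ONES[frac], "degrees", "fahrenheit"]


def clip_sequence(words: List[str]) -> List[str]:
    """Map words to hypothetical clip names; the last word gets a
    sentence-ending inflection, the rest a neutral one."""
    if not words:
        return []
    clips = [f"{word}_neutral.wav" for word in words[:-1]]
    clips.append(f"{words[-1]}_final.wav")
    return clips
```

With this layout the device stores a few dozen word clips, some in more than one inflection, rather than a recording of every possible sentence.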
Another speech technology of note, albeit a rudimentary one, is just another version of digitized speech. It's the voice recording and playback technology found in stuffed animals, greeting cards, and picture frames. Millennium Compliance Corp. (Southington, CT) is a small company started by John Dobbins, a pharmacist originally from the University of Connecticut (Storrs, CT). The company has put a relatively inexpensive recording and playback device into its Talking RX Prescription Reader (see Figure 3).
The cuplike device, which attaches snugly to the bottom of a medium-sized medication container, enables the pharmacist to record essential information about the contents (normally found on a pill bottle's paper label). The device, which is sold over the counter, is targeted toward visually impaired consumers, who may have previously relied on others to read the information on their pill bottles. Millennium's product can record a 60-second segment. However, Dobbins says, “the 60-second capacity is usually far more than pharmacists need to record the vital information.”
|Figure 2. Roche Diagnostics' Accu-Chek Voicemate can report a blood glucose level aloud.|
Some general guidelines on the design of effective voice prompts can help ensure the success of simpler, one-way applications of speech technology.
Learn about the User. User-centered design philosophy involves device users throughout the user-interface design process. Designing the voice-based user elements of a medical device is no exception. Developing effective voice prompts calls for substantial input from the intended users. For starters, users can provide valuable feedback on the concept of using speech at all. They can then make suggestions about wording and provide feedback on prototype designs.
Ensure Proper Task and Information Flow. It is helpful to chart the flow of the voice prompts and associated device and user-related actions. This way, you can be sure that the process is logical and intuitive, thereby helping to avoid confusion and use errors. In addition, device makers should identify the likely use errors and ways to recover from them.
Prompt Users at a Suitable Pace. Prompts should not outpace users' ability to listen, understand, and follow instructions. Conversely, the pace should not be so sluggish that users lose their concentration, become annoyed, or are unnecessarily delayed from performing urgent actions. Dobbins's advice to people recording a talking prescription is to speak slowly and deliberately, particularly if the patient has diminished mental capacity.
Synchronize Prompts with Actions. It is easy for prompts to get out of sync with user actions, particularly if the user does not perform tasks in the anticipated order. Thus, designers need to consider the full range of possible user behaviors to determine the best way to maintain synchronization. Ultimately, it may be necessary for users to press a button to indicate that they are ready to progress to the next step.
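One way to keep prompts in step with the user is to advance only when the user confirms readiness, as suggested above. The sketch below is a minimal illustration of that idea. The class name, its methods, and the sample prompt wording are hypothetical and are not taken from any actual device.

```python
from typing import List, Optional


class PromptSequencer:
    """Hold an ordered list of prompts and advance only on confirmation."""

    def __init__(self, prompts: List[str]) -> None:
        if not prompts:
            raise ValueError("at least one prompt is required")
        self._prompts = list(prompts)
        self._index = 0

    @property
    def finished(self) -> bool:
        """True once every prompt has been confirmed."""
        return self._index >= len(self._prompts)

    def current_prompt(self) -> Optional[str]:
        """Return the prompt for the current step (also used to repeat it)."""
        return None if self.finished else self._prompts[self._index]

    def confirm(self) -> Optional[str]:
        """Handle a button press: move to the next step and return its prompt."""
        if not self.finished:
            self._index += 1
        return self.current_prompt()


# Hypothetical wording for a three-step glucose test.
steps = PromptSequencer([
    "Apply blood to the test strip.",
    "Insert the strip into the device.",
    "Listen for the result.",
])
```

Because the device never moves on by itself, a user who is slow or who pauses mid-task hears the same prompt again rather than one that no longer matches what he or she is doing.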
Make Prompts Sufficiently Loud. Some medical devices are used in quiet environments, such as a bedroom or office, while others may be used in louder environments, such as a factory floor. Designers should consider incorporating a volume control. Setting a device to emit prompts at the maximum required volume may make them too loud. However, in cases such as an AED, it may be better to keep the device's user interface simple by excluding a volume control and presetting the sound level so that prompts are intelligible against typical or worst-case background noise. Further, ensure that the sound-production hardware has enough power to achieve the required sound levels.
Use Plain Wording. Even a sophisticated user is better served by plainly worded prompts. Simple wording usually gets the point across faster and avoids confusion, particularly in cases where users must divide their attention between performing tasks (e.g., checking for breathing) and listening for the next instruction. Plain wording also promotes better understanding among nonnative speakers and people who have limited vocabularies. One strategy for developing plainly worded prompts is to analyze people giving instructions to each other, or so-called natural dialogues. The assumption is that natural-sounding prompts will be easier to understand and follow.
Be Consistent. When designing voice prompts, it is important to employ consistent terminology and syntax. Avoid inconsistencies such as those reflected in the following:
• “To start the test, press the red button.”
• “Push the green key to stop the test.”
Note the arbitrary use of the terms “button” versus “key” and the different sentence structures, which may complicate matters for some users.
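A simple script can help catch this kind of terminology drift across a full prompt set. The sketch below flags any group of interchangeable terms from which the prompts use more than one member. The synonym groups are an assumption made for illustration: "button" versus "key" comes from the example above, while "press" versus "push" is added here. A real project would need its own list, and the check does not look at sentence structure.

```python
import re
from typing import List

# Hypothetical groups of terms that should not be mixed within one prompt set.
SYNONYM_GROUPS = [
    {"button", "key"},
    {"press", "push"},
]


def find_inconsistencies(prompts: List[str]) -> List[List[str]]:
    """Return the terms used from each synonym group that has more
    than one of its members appearing somewhere in the prompts."""
    words = set()
    for prompt in prompts:
        words.update(re.findall(r"[a-z']+", prompt.lower()))
    return [
        sorted(group & words)
        for group in SYNONYM_GROUPS
        if len(group & words) > 1
    ]


mixed = [
    "To start the test, press the red button.",
    "Push the green key to stop the test.",
]
consistent = [
    "To start the test, press the red button.",
    "To stop the test, press the green button.",
]
```

Run against the two prompts quoted above, the check reports both the button/key and the press/push mix. Run against the reworded pair, it reports nothing.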
Keep Prompts Short. People are better at following short prompts than long ones that may cause them to forget details. Assume that users have a short attention span. However, take care not to make prompts so terse that they fail to communicate their point.
|Figure 3. The Talking RX Prescription Reader employs a recording and playback device.|
Provide Clear Direction. When trying to move a task along efficiently, there is no room for extraneous detail. Unnecessary detail can obscure the primary message. For example, aircraft warning systems emphatically state “Pull up!” if a plane is going to crash into the ground. This prompt is clearly superior to “The plane is descending toward the ground at a dangerously high rate of 100 feet per second.” The more-detailed prompt never actually tells the pilot what to do in the face of an imminent hazard. So, particularly in the case of emergency treatment devices, get to the point quickly. Thus Zoll Medical's AED states “Don't touch the patient” instead of “Do not touch the patient because an ECG analysis is in progress or about to begin.”
Ensure a Suitable Tone. An emphatic-sounding prompt may be appropriate when guiding a task that must be performed quickly and correctly, such as attaching an AED's electrodes to a specific spot on a victim's chest. However, the same tone might not be appropriate for a blood glucose meter intended for daily use.
Designers also need to consider the ramifications of recording speech passages using a particular tone of voice, including variables such as pitch and inflection. Tone is easier to manipulate when dealing with digitized speech, as compared with synthesized speech. In most cases, a polite, nonjudgmental-sounding voice is warranted.
The speaker's gender is a key variable. In some cases, designers may want to give users the choice of a male-sounding versus female-sounding voice. Some evidence suggests that a female voice is more attention-getting, but either a female or male voice can do the job.
Ultimately, developers should probably choose a voice with a tone that a majority of the users can relate to. Avoid an especially high-pitched female voice or an especially low-pitched male voice. Either may be off-putting to some users, and either may be clipped by the recording and reproduction technology, which is likely to deliver low-fidelity sound.
Ensure Proper Translation into Other Languages. Avoid translating prompts word for word because syntax and common word usage vary widely among languages. It is better to take a more holistic approach to translations, ensuring that the final prompts sound natural to native language speakers. Accordingly, it may be best to engage native speakers to perform the translations into their first language.
Provide Alternative Prompts. For medical devices that may be designated for exclusive use by either laypersons or medical professionals, it may make sense to offer alternative sets of voice prompts. Each set can be tailored to speak the language of the target user population. The set intended for use by medical professionals would employ common medical jargon, while the set intended for use by laypersons would not. In certain cases, developers may want to leave open the possibility of tailoring prompts to a particular customer's needs or demands.
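In software, alternative prompt sets can be as simple as a lookup table keyed by audience. The sketch below is one possible arrangement, not a recommendation for any particular device. The audience names, the prompt IDs, the professional wording, and the choice to fall back to the layperson wording when no tailored prompt exists are all assumptions made for illustration.

```python
# Hypothetical prompt sets. The professional set overrides only the
# prompts for which jargon-based wording has been written.
PROMPT_SETS = {
    "layperson": {
        "check_breathing": "Check breathing.",
        "dont_touch": "Don't touch the patient.",
    },
    "professional": {
        "check_breathing": "Assess respirations.",
    },
}

DEFAULT_AUDIENCE = "layperson"


def get_prompt(prompt_id: str, audience: str) -> str:
    """Return the prompt tailored to the audience, or the default
    (layperson) wording when no tailored version exists."""
    tailored = PROMPT_SETS.get(audience, {})
    if prompt_id in tailored:
        return tailored[prompt_id]
    return PROMPT_SETS[DEFAULT_AUDIENCE][prompt_id]
```

Keeping the sets in data rather than in program logic also leaves room for a customer-specific set to be added later without changing the playback code.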
Validate the Prompts through User Testing. A set of prompts may look good and logical on paper, but they may not work when presented in context. Therefore, plan to conduct a usability test of the prompts at several stages of development. Keep refining the prompts until they promote the desired user behavior and, in appropriate cases, the users like them.
Medical devices that talk sound like progress. However, as with most enabling technologies, there are perils to avoid. Voice prompts should be reserved for cases where they truly enhance user interactions. Guiding users through an emergency procedure during which attention is split between device interactions and direct patient care seems to be a good application. Enabling people with visual impairments to use medical devices without assistance is another. Voice prompts may even benefit all users, because they free the eyes and hands for other tasks. But medical device developers should guard against overusing the technology. The world does not need a bunch of chatty medical devices that speak out for no compelling reason.
Copyright ©2003 Medical Device & Diagnostic Industry