Most of us have by now tinkered with speech-to-text, a relatively recent software innovation that lets users text hands-free. But what if that process could be reversed to bring written words to life for the visually impaired? That’s what researchers at the MIT Media Lab are trying to achieve with a new finger-mounted reading device that converts written text into synthesized speech in real time.
The prototype uses software to provide feedback, either tactile or audible, that guides the user’s finger along a line of text while reading out the corresponding words in real time. The MIT team built two variations of the device, each using a different kind of feedback to keep the reading finger on track.
The first iteration uses two haptic motors, one above and one below the finger, to guide it with vibrations. When the motor on top of the finger vibrates, the user should lower the device as it moves across the page; when the motor beneath the finger vibrates, the tracking finger needs to be raised.
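The vibration logic amounts to a simple mapping from the finger’s vertical offset off the text line to a motor command. Here is a minimal sketch of that mapping in Python; the function name, the pixel-based coordinates, and the tolerance value are illustrative assumptions, not details of the team’s actual implementation:

```python
def haptic_cue(finger_y: float, baseline_y: float, tolerance: float = 5.0) -> str:
    """Map the finger's vertical offset from the text baseline (pixels,
    y increasing downward) to a motor command. The top motor tells the
    user to lower the finger; the bottom motor tells them to raise it."""
    offset = finger_y - baseline_y   # negative: finger above the line
    if offset < -tolerance:
        return "vibrate_top"         # drifted above the line: lower the device
    if offset > tolerance:
        return "vibrate_bottom"      # drifted below the line: raise the finger
    return "idle"                    # on the line: no vibration needed
```

A dead zone (`tolerance`) keeps the motors from buzzing constantly over tiny tracking jitter.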
The second iteration uses audio feedback rather than haptic motors to guide the tracking finger: a musical tone grows louder as the finger drifts away from the line of text. Trials so far haven’t clearly shown which approach users find more effective, but the researchers are concentrating on the audio-feedback model because it allows for a smaller, lighter sensor.
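A tone that scales with drift can be sketched as a one-line volume curve. This is a hypothetical illustration of the idea (the linear ramp and the `max_drift` cutoff are assumptions), not the researchers’ tuning:

```python
def tone_volume(drift_px: float, max_drift: float = 30.0) -> float:
    """Audio-feedback sketch: silent while the finger sits on the line,
    growing linearly with the distance drifted, capped at full volume."""
    return min(abs(drift_px) / max_drift, 1.0)
```

Because the cue is continuous rather than on/off, the user hears not just *that* they have drifted but roughly *how far*.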
The finger-mounted device is the latest camera-based technology out of MIT aimed at helping the visually impaired. Last summer, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) developed a device that pairs a 3-D camera with vibrating motors and a braille interface, using the camera’s detailed depth information to give visually impaired users navigation and object-identification abilities in their surrounding environment.
Similarly, the new finger-mounted device aims to identify text for visually impaired users in real time, regardless of the source of the text. The key to translating words into audio on the fly is an algorithm, developed specifically for this task, that processes the camera’s video feed.
Each time the user places the device at the start of a new line, the algorithm makes a series of guesses about the baseline of the letters. Given that baseline estimate, it tracks each individual word as it slides past the camera. When a word reaches the center of the camera’s field of view, the algorithm crops it out of the image and passes it to open-source software that recognizes the characters, so the word can be rendered as synthesized speech.
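The word-triggering step described above can be sketched as a small pipeline: track word boxes in the frame, fire when a box’s center falls near the frame’s center, and hand the crop to a recognizer. Everything here is a simplified stand-in; the `WordBox` type, the constants, and the `recognize` stub (which a real build would replace with an open-source OCR engine such as Tesseract) are assumptions for illustration:

```python
from dataclasses import dataclass

FRAME_WIDTH = 320   # hypothetical camera frame width, in pixels
CENTER_BAND = 20    # how close (px) a word's center must be to trigger a crop

@dataclass
class WordBox:
    text: str       # stands in for the pixels an OCR engine would read
    left: int       # left edge of the tracked word, in frame coordinates
    right: int      # right edge of the tracked word

def recognize(box: WordBox) -> str:
    # Placeholder for the character-recognition step on the cropped image.
    return box.text

def words_to_speak(frame_words):
    """Return the recognized text of every tracked word whose center lies
    within CENTER_BAND pixels of the frame's center, i.e. the words the
    device would crop and speak for this frame."""
    center = FRAME_WIDTH / 2
    spoken = []
    for box in frame_words:
        word_center = (box.left + box.right) / 2
        if abs(word_center - center) <= CENTER_BAND:
            spoken.append(recognize(box))
    return spoken
```

With the frame center at 160 px, a frame containing boxes at 10–60, 150–180, and 250–300 would speak only the middle word, since only its center (165 px) falls inside the band.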
For now, the prototype is tethered to a laptop, which runs the algorithms. As the work moves forward, the group aims to develop a version of the software that runs on a smartphone, making the device more portable.
Along with aiding the visually impaired, the group believes the device could serve a variety of other applications, such as helping patients who struggle with dyslexia or reading comprehension. In time, the researchers hope the device will not only help the visually impaired read text, but one day relay information about objects in their surrounding environment as well. Until then, the group hopes to restore some sense of sight to the visually impaired, even if it’s just one word at a time.