Lawson Wong and his mentors at MIT haven't made plans to apply their robotics research advances to medical technology, but Wong can see where it might work.
Image courtesy of Christine Daniloff and Jose-Luis Olivares/MIT
The researchers at MIT's Computer Science and Artificial Intelligence Laboratory are working on making household robots better able to identify and manipulate objects in the home--the sort of thing that can help people with visual impairments.
The team showed that a robot combining views of a scene from multiple perspectives can recognize four times as many objects as one that uses a single perspective, while reducing the number of misidentifications. A paper on the study is scheduled to appear in a forthcoming issue of the International Journal of Robotics Research, according to a statement from MIT.
"There is different software for just detecting an object at this point on the table," said Wong, a PhD candidate in computer science. "These software pieces tend to have quite a bit of error in them."
Wong and his thesis advisors, Leslie Kaelbling, the Panasonic Professor of Computer Science and Engineering, and Tomás Lozano-Pérez, the School of Engineering Professor of Teaching Excellence, presented a different algorithm to speed up the identification process and make it more practical for in-home use.
They mounted a Microsoft Kinect camera atop the robot to capture images of the scene from different perspectives as the robot moved around it. They then used an image-matching system to help the robot identify the objects across those views.
We're not talking about just a cup, a spoon and a pair of salt-and-pepper shakers. The researchers considered scenarios in which they had 20 to 30 different images of household objects clustered together on a table. In several scenarios, the clusters included multiple instances of the same object, closely packed together, which makes the task of matching different perspectives more difficult.
To keep the required number of samples low, the researchers adopted a simplified technique for evaluating hypotheses. The most mathematically precise way to compare hypotheses would be to consider every possible set of matches between the two groups of objects. If you include the possibilities that the detector has made an error and that some objects are occluded from some views, that approach would yield 304 different sets of matches.
To keep the computation tractable, the researchers' algorithm considers each object in one group separately and evaluates its likelihood of mapping onto an object in a second group. If there are four objects in each group, object 1 in the first group could map onto objects 1, 2, 3, or 4 in the second, as could object 2 in the first group, and so on, the MIT statement explained. With the possibilities of error and blocked views factored in, this approach requires only 20 comparisons.
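The arithmetic behind the article's figure is straightforward: with error and occlusion folded in as a "no match" option, each of the four objects is compared against five possibilities. A quick sketch using the article's numbers:

```python
# Comparison count for the per-object approach described above,
# using the article's example of four objects per group.
objects_in_group = 4
# Each object can map onto any of the 4 objects in the other group,
# or onto "no match" (detector error / occlusion).
candidates_per_object = objects_in_group + 1
per_object_comparisons = objects_in_group * candidates_per_object
print(per_object_comparisons)  # 20, the figure quoted in the article
```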
The researchers' algorithm also looks for double mappings (two objects in one group mapped onto the same object in the other) and re-evaluates them. That takes extra time, but not nearly as much as considering aggregate mappings would. In this case, the algorithm would perform 32 comparisons--more than 20, but significantly fewer than 304.
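The per-object matching with a double-mapping re-check described above might be sketched as follows. This is a minimal illustration, not the paper's implementation: the similarity scores, the threshold, and the keep-the-stronger-match tiebreak are all hypothetical.

```python
# Minimal sketch of per-object matching between two groups of detections.
# similarity[i][j] is a hypothetical score for A-object i vs B-object j.

def match_objects(similarity, threshold=0.5):
    """Map each object in group A to its best candidate in group B,
    or to None when no candidate clears the threshold (standing in
    for detector error or occlusion)."""
    mapping = {}
    for i, row in enumerate(similarity):
        best_j = max(range(len(row)), key=lambda j: row[j])
        mapping[i] = best_j if row[best_j] >= threshold else None

    # Re-evaluate "double mappings": two A-objects claiming the same
    # B-object. Visit A-objects strongest-first; the first claimant
    # keeps the match, weaker ones are reassigned to their next-best
    # unclaimed candidate (or to None).
    claimed = {}
    strength = lambda i: -(similarity[i][mapping[i]]
                           if mapping[i] is not None else -1.0)
    for i in sorted(mapping, key=strength):
        j = mapping[i]
        if j is None:
            continue
        if j in claimed:
            alternatives = sorted(
                (jj for jj in range(len(similarity[i])) if jj not in claimed),
                key=lambda jj: -similarity[i][jj],
            )
            if alternatives and similarity[i][alternatives[0]] >= threshold:
                mapping[i] = alternatives[0]
                claimed[mapping[i]] = i
            else:
                mapping[i] = None
        else:
            claimed[j] = i
    return mapping
```

With made-up scores where A-objects 0 and 1 both prefer B-object 0, the re-check lets the stronger match (A-object 0) keep it, while A-object 1 falls back to None because its next-best score is below the threshold.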
"The robot needs to know what is it that it is faced with in the world," Wong explained. "To cook a meal, where's the pots and pans, where's the stuff in the fridge? It's all about building up a representation of the world...
"I could see that being useful for limited vision," Wong continued. "I think that some people have already been trying to apply object or text detectors that specifically detect text on signs that can be read back to people with vision impairments. To have it be possible for visual data would be great as well."
Nancy Crotti is a contributor to Qmed and MPMN.