GRASP Lab teaches robot to read

(Pictured: Penn Engineering graduate student Menglong Zhu and Graspy)

It’s not so unusual to see robots rolling or flying around the General Robotics, Automation, Sensing and Perception (GRASP) Lab at Penn’s School of Engineering and Applied Science, but now there is a robot that reads as it rolls.

Research by graduate student Menglong Zhu and postdoctoral fellow Kosta Derpanis, along with advisor and GRASP Lab Director Kostas Daniilidis, has made Penn’s PR2 robot (named Graspy) the first of its kind to be literate.

The PR2 is a customizable robot built by Willow Garage that has been distributed to institutions like Penn for use in research. While other GRASP teams work on Graspy’s ability to manipulate objects with its highly articulated arms and grippers, Zhu’s research focuses on cameras that serve as the robot’s eyes and the programming that serves as its brain.

The process that turns images of words into their digital equivalents, known as optical character recognition, or OCR, is not new technology. OCR currently allows Google to digitize books en masse and smartphones to translate signs into other languages.

But in those examples, humans are doing the hard part of the task: locating the words themselves.

“If you don’t tell the OCR where the words are, it will either output junk or nothing,” says Zhu. “You can’t just find one letter in the environment, you have to find whole words.”

Graspy locates words by looking for groups of close-together lines with similar widths and spacing, which tend to represent letters. The robot can then perform OCR on what it thinks are words, checking them against a customizable dictionary.
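The article does not give Graspy's exact detection code, but the grouping idea it describes can be sketched as follows. In this hypothetical example, letter candidates (with estimated stroke widths) are merged into words when their stroke widths are similar and the gaps between them are small; all names and thresholds are illustrative assumptions, not the researchers' implementation.

```python
from dataclasses import dataclass

@dataclass
class Letter:
    x: float       # left edge of the bounding box
    width: float   # bounding-box width
    stroke: float  # estimated stroke width (hypothetical input)

def group_words(letters, stroke_ratio=2.0, gap_factor=1.5):
    """Group letter candidates into words: neighbors belong to the same
    word when their stroke widths are within stroke_ratio of each other
    and the horizontal gap is small relative to the letter width."""
    letters = sorted(letters, key=lambda l: l.x)
    words, current = [], [letters[0]]
    for prev, cur in zip(letters, letters[1:]):
        gap = cur.x - (prev.x + prev.width)
        similar = max(prev.stroke, cur.stroke) / min(prev.stroke, cur.stroke) < stroke_ratio
        close = gap < gap_factor * max(prev.width, cur.width)
        if similar and close:
            current.append(cur)
        else:
            words.append(current)  # gap or width change: start a new word
            current = [cur]
    words.append(current)
    return words
```

Each resulting group would then be cropped and handed to the OCR engine, so the recognizer only ever sees regions that plausibly contain whole words.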

To improve its accuracy, Graspy uses an algorithm similar to those found in spellchecker programs. While those programs make assumptions about misspellings based on the proximity of letters on a keyboard and common human errors, Zhu’s program must account for Graspy viewing words at an angle, or for fonts in which two characters look identical (such as an uppercase “I” and a lowercase “l”).
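One common way to realize this kind of correction, sketched below as an illustration rather than the researchers' actual method, is an edit distance whose substitution costs are weighted by visual similarity: swapping look-alike characters such as “I” and “l” is cheap, so the dictionary word closest in appearance wins. The confusion pairs and costs here are assumptions for the example.

```python
# Pairs of characters that look alike in many fonts (illustrative list).
CONFUSABLE = {("I", "l"), ("l", "I"), ("O", "0"), ("0", "O")}

def sub_cost(a, b):
    """Substitution cost: free if equal, cheap if visually confusable."""
    if a == b:
        return 0.0
    return 0.2 if (a, b) in CONFUSABLE else 1.0

def visual_distance(ocr, word):
    """Edit distance with visually weighted substitutions (insertions
    and deletions cost 1, as in the standard Levenshtein recurrence)."""
    n, m = len(ocr), len(word)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = float(i)
    for j in range(m + 1):
        d[0][j] = float(j)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j] + 1,            # delete from OCR output
                          d[i][j - 1] + 1,            # insert missing letter
                          d[i - 1][j - 1] + sub_cost(ocr[i - 1], word[j - 1]))
    return d[n][m]

def best_match(ocr, dictionary):
    """Return the dictionary word that looks most like the OCR output."""
    return min(dictionary, key=lambda w: visual_distance(ocr, w))
```

For example, an OCR reading of “Iab” would match “lab” rather than “tab”, because the I/l substitution is nearly free while I/t costs a full edit.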

Once the words are digitized, Graspy can read them aloud with a speech synthesizer.

The ability to find and digitize words in a human environment would make it easier for robots like Graspy to navigate through large spaces, the researchers say. Robots would be able to find directions to a certain room in a strange building much like a human would, by reading posted signs. 

That ability could also potentially help people with vision impairments, allowing them to wear a camera system that would read signs in front of them.

The reading robot has received local and national media attention, including coverage by Philadelphia WHYY's Newsworks site.  

Originally published on June 9, 2011