The Vision Thing, Robots Can Do
Much of Bajcsy's own research
at GRASP has focused on connecting machine perception with action. To
understand what that means for robots, consider the problem of computer
vision: How do robots look at things? Sixteen years ago, Bajcsy had the
idea that a robot's visual perception, its knowledge of the environment,
would be improved if it could actively adapt to the scene around it. Bajcsy
offers this analogy with human vision: "We do not just see; we look.
Our pupils adjust to the level of illumination; our eyes bring the world
into sharp focus; we move our heads or change our position to get a better
view of something; and sometimes we even put on spectacles." Percepts, those
impressions of objects that we collect with our senses, don't
just fall onto our eyes like rain falls onto the ground, Bajcsy points
out. The same should hold true for a sensor on a robot. How does
that work in practice? Think of a camera mounted on a robot. The robot
moves its head or its body (there's the action) to help the
camera collect information (that's the perception). In turn the
camera's visual information is fed to a computer that controls the
robot's motion: a tidy feedback loop. The overriding idea is that
actively changing a sensor's parameters (like the camera's position
or focus) helps the robot adapt to an uncertain environment.
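To make that feedback loop concrete, here is a minimal sketch in Python of one such perception-action cycle. Everything in it, from the toy sharpness measure to the step sizes, is an illustrative assumption rather than code from the GRASP lab; the point is only that the robot acts, senses the result, and acts again.

```python
# A minimal sketch of a perception-action loop: the robot adjusts a sensor
# parameter (here, camera focus) based on what it perceives, and the improved
# percept drives the next adjustment. All names and numbers are illustrative.

def measure_sharpness(focus, true_depth):
    """Pretend image-sharpness score: highest when focus matches the scene depth."""
    return 1.0 / (1.0 + (focus - true_depth) ** 2)

def active_focus(true_depth, focus=0.0, step=1.0, iterations=40):
    """Climb toward the focus setting that maximizes perceived sharpness."""
    for _ in range(iterations):
        here = measure_sharpness(focus, true_depth)          # perception
        ahead = measure_sharpness(focus + step, true_depth)  # probe one action
        if ahead > here:
            focus += step    # the action improved the percept; keep going
        else:
            step *= -0.5     # overshot: turn around and take finer steps
    return focus

if __name__ == "__main__":
    print(f"converged focus: {active_focus(true_depth=3.2):.2f}")  # settles near 3.2
```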
That link between perception
and action was a new way of looking at computer vision in the early 1980s.
In 1983 Bajcsy worked her theory into a new research paradigm that she
called "active perception." The concept has become Bajcsys
signature in the robotics community and just two years ago was a factor
in her election to the National Academy of Engineering.
In recent years, Bajcsy and
her co-workers in the GRASP lab have taken the concept of computer vision
a step further with another question: How can you use computer vision
to gather information about the objects in a three-dimensional world?
The answer to this question is making reality out of "tele-immersion,"
a new technology that seems the fabric of fantasy. Using tele-immersion,
you could have a meeting with people scattered across the country or across
the globe, and each of you would feel as if you were in the same physical
space. You'd be in a shared, simulated environment, but the person
you "see" across the room would look real, not like an animated
image. Real enough to make out the textures and vagaries of hair and skin
and clothes. Real enough to touch.
The principle that will make
this technology work is called "stereo-reconstruction." In each
of the remote locations, a set of video cameras (at least two) takes
pictures of the people and the objects in the room, capturing movements
and measuring distances from the camera. The next step is putting together
a three-dimensional reconstruction of the scene, incorporating changes
as quickly as they happen. For that the GRASP tele-immersion group is
refining a special algorithm that recovers all of the image information
from the cameras. Then all the reconstructions can be projected into a
virtual world. Or the 3-D reconstructions can be integrated into a real
environment for a mind-bending taste of mixed reality. (Visit the lab's
Web site at www.cis.upenn.edu/~grasp/research.htm for a sample.)
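The geometric idea underneath such a reconstruction can be sketched in a few lines. The Python fragment below uses the standard two-camera (pinhole) relation, depth = focal length x baseline / disparity; the camera numbers are made up, and the lab's actual algorithm, which matches and reconstructs whole moving scenes, is far more involved.

```python
# A back-of-the-envelope sketch of the stereo principle: two cameras a known
# distance apart see the same point at slightly different image positions,
# and that disparity gives depth via Z = f * B / d. Values are illustrative.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (meters) of a point seen by two parallel cameras."""
    if disparity_px <= 0:
        raise ValueError("point must appear shifted between the two images")
    return focal_px * baseline_m / disparity_px

def reconstruct_point(x_left_px, y_px, focal_px, baseline_m, disparity_px):
    """Back-project a matched image point (measured from the image center)
    into 3-D camera coordinates."""
    z = depth_from_disparity(focal_px, baseline_m, disparity_px)
    x = (x_left_px / focal_px) * z
    y = (y_px / focal_px) * z
    return (x, y, z)

if __name__ == "__main__":
    # A feature with 40 pixels of disparity, seen by cameras 0.12 m apart.
    print(reconstruct_point(x_left_px=200, y_px=-50, focal_px=800,
                            baseline_m=0.12, disparity_px=40))
```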
Dr. Kostas Daniilidis, Bajcsy's
colleague in this project, says engineers could use tele-immersion in
the collaborative design of mechanical parts. He sees other applications
as well, for example, in the entertainment industry: A dancer in Chicago
and a dancer in New York could train together in a virtual ballroom. But
those are in the future. For now, tele-immersion is seen as a test. The
GRASP lab is part of the National Tele-Immersion Initiative, an academic
consortium with plans to use tele-immersion as the toughest available
technical challenge for Internet2, the university-led research and education
network meant to develop pre-commercial technologies and enhance the federal
government's Next Generation Internet. One reason tele-immersion
will be such a good test of Internet2 is that it draws on a very wide
bandwidth (data-transfer rate) that current Internet technology can't provide.
The reconstruction issue
comes up again in another project of Bajcsy's, this one on cooperative
systems, which looks at how robots interact with each other and with people.
Consider, say, a team of robots marching through an unknown environment.
"The question is how much autonomy you want to give a system when
the different members are supposed to cooperate," Bajcsy says. "It's
like in the Boy Scouts: Meet me at place X. How much do you need to communicate
to get there? And what if one of your members gets stuck? What do you do?"
Bajcsy has developed some
mathematical models of this cooperation and autonomy and implemented them
in the robots. But smoothing the robots' paths to point X calls for
a very accurate reconstruction of the environment by their sensors. And
the familiar duet of action and perception comes into play here too. It
turns out that different "perception strategies" will apply
in different dynamic situations; in other words, the robot has to figure
out the best way to avoid running into a wall.
Using motion parallax (the
apparent change in an object's position when you view it from two
different spots) works pretty well in open spaces. For a robot making
a turn, stereo (the use of two cameras that work like a pair of eyes) is
a better way to keep from tripping over obstacles. Our own biological
systems work in much the same way. The challenge with a robot, says Bajcsy,
is designing an automatic switch that, depending on the changing environment,
will select the best perception strategy to help the robot reach its goal.
"Once we have good perception," she adds, "the control
of the robots is easy."
Here's one example of
what a cooperative system has actually done in the GRASP lab: two mobile
robots work together to carry a large object (a box or a pipe, say) and
move across a room. As they move, first one robot leads the way; then
they change their configuration and move side by side; then one robot
takes the lead again. And all the while they hold the box together and
traverse obstacles in their path.
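One simple way to picture the robots' decision to change configuration is a rule tied to the free space around the carried object, as in the sketch below; the rule and its numbers are invented for illustration, not taken from the GRASP experiments.

```python
# Illustrative leader/side-by-side switching for two robots carrying a box:
# pick a formation from the width of the passage ahead. Not the lab's control law.

BOX_LENGTH_M = 1.2  # assumed length of the carried object

def choose_formation(passage_width_m):
    """Side by side when there is room for the box plus both robots; otherwise file."""
    if passage_width_m > BOX_LENGTH_M + 1.0:  # extra meter for the two robot bodies
        return "side_by_side"
    return "leader_follower"

def plan_formations(passage_widths_m):
    """Choose a formation for each stretch of the route."""
    return [choose_formation(w) for w in passage_widths_m]

if __name__ == "__main__":
    # Widths along the route: open room, doorway, open room again.
    print(plan_formations([3.0, 1.4, 3.5]))
    # ['side_by_side', 'leader_follower', 'side_by_side']
```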
Understanding active cooperation
like this is important, says Bajcsy, because it will help us understand
what makes intelligent behavior. When computer scientists talk about making
intelligent machines, they use the term "artificial intelligence."
Bajcsy views artificial intelligence as a discipline that tries to understand
what intelligence really is and includes "a whole gamut of activities,"
like perception, action, reasoning and learning. AI does not mean copying
humans, she believes: "Machines can do some activities better than
humans. They can multiply faster, do mathematical operations, remember
more. But humans are more flexible, and that's the puzzle. How do
we understand this flexibility?"
Bajcsy's mentor at Stanford,
John McCarthy, agrees that you can't point to one thing that will
answer the question of whether or not a machine is intelligent. Though
we define intelligence by relating it to human capabilities, "You
have to ask which aspects of intelligence have been understood well enough
to put into computer programs," he says. "AI researchers are
free to use methods that are not observed in people or that involve much
more computing than people can do."
Most of us would probably
agree that a machine can be called intelligent at some level. But there's
another element of human intelligence called consciousness. Can machines
have that? That's a murky area, says Bajcsy, and one that the philosopher
Daniel Dennett, author of Consciousness Explained, among other
books, would call a frontier of science.
AI was built on the premise
that we can model and implement whatever is rational behavior, she says.
Now researchers are making progress toward trying to understand the emotional
aspect. And that's where consciousness comes in: "We all know
that your emotional state can influence your rational behavior. But how
do you put it into mathematical or formal terms; that's really the
question. When you cry or smile, we can measure your heartbeat, your perspiration,
your temperature. All the lie detectors, for example, are based on this.
So with emotions we have all these indirect observables. But are they
the right observables? That is really the issue: that we cannot crawl
inside you and measure what is going on."
McCarthy views consciousness
from a different angle. Conscious machines, he says, will need the power
of self-observation: "Computer programs will have to be able
to look at themselves and evaluate their own intentions. They'll
have to decide whether their line of thinking is successful or unsuccessful,
the way people do."