Vision and Motion
by David Pescovitz
UC Berkeley professor Jitendra Malik, Associate Chair of the Computer Science Division
UC Berkeley professor
Jitendra Malik, associate chair of the computer science division,
has become an expert in the graceful human dynamics of ballet. Malik
is not a ballet dancer, though, nor a particularly big fan of classical
dance. He's had to study the subtleties of pliés and relevés and
a host of other human motions in order to teach a computer to identify
what a person is doing just by watching him or her. His novel approach
to computational analysis of human movement has myriad applications,
from ultra-realistic videogames, where the players control human
actors on the screen, to surveillance.
"A big aspect of human intelligence is vision - how we, using
our eyes, understand the world around us," says Malik, who
is also a researcher with the Center for Information Technology
Research in the Interest of Society (CITRIS). "But it's not
just that we need to recognize a tiger in front of us. We need
to recognize that the tiger is jumping toward us."
To empower computers with that same capability, Malik and graduate
students Alyosha Efros, Greg Mori and Alex Berg have developed a
software system that instantly classifies the videotaped actions
of a human figure in various settings, from the World Cup to a ballet
recital. They've also used their system to create a video sequence
where a World Cup player appears to mimic the motion of one of the
graduate students showing off his soccer skills on videotape. The
researchers will present their results in a scientific paper at
the IEEE International Conference on Computer Vision in France next
month.
Here's how the underlying technology works: The researchers provide
the computer with a digitized video clip of, for example, a televised
soccer game. Even though the players appear quite small in the frame, the
pattern of motion (a run toward the goal, for instance) is easily
identifiable by the cyclic motions of the hands and legs.
The software captures that pattern and computes the "optical
flow," the local movement of each pixel in the part of the
image being analyzed. Those optical flow vectors are then compared
to the data in a library of predetermined patterns, representing
walks, jumps, ballet movements, and other human motions shot from
a variety of angles. "The computer finds the best match and
identifies the action," Malik says.
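The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the article does not say how two sets of optical flow vectors are compared, so cosine similarity is used here as a stand-in, and the tiny four-pixel flow fields, the `LIBRARY` contents, and the function names are all hypothetical.

```python
import math

def flatten(flow):
    """Turn a list of (dx, dy) flow vectors into one flat list of numbers."""
    return [component for vector in flow for component in vector]

def similarity(flow_a, flow_b):
    """Cosine similarity between two flow fields (an assumed measure)."""
    a, b = flatten(flow_a), flatten(flow_b)
    if len(a) != len(b):
        raise ValueError("flow fields must cover the same number of pixels")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def classify(query_flow, library):
    """Return (label, score) for the library example most similar to the query."""
    best_label, best_score = None, -math.inf
    for label, examples in library.items():
        for example in examples:
            score = similarity(query_flow, example)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

# Hypothetical library: one example per action, each a 4-pixel flow field.
LIBRARY = {
    "run_right": [[(1.0, 0.0), (1.0, 0.1), (0.9, 0.0), (1.0, -0.1)]],
    "jump": [[(0.0, -1.0), (0.1, -1.0), (0.0, -0.9), (-0.1, -1.0)]],
}

# Hypothetical flow measured on a new clip: mostly rightward motion.
query = [(0.8, 0.1), (1.1, 0.0), (1.0, 0.0), (0.9, 0.0)]
label, score = classify(query, LIBRARY)
print(label, round(score, 3))
```

A real system would first have to estimate the flow from consecutive video frames and would hold many examples per action, shot from many angles. The sketch covers only the final "find the best match" step.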
Given a video input sequence of frames depicting a soccer
player (above), the software identifies the most closely
matching frames in a database of 3D motion capture
data (below). The motion capture data is rendered
from many viewing directions using a stick figure.
Image courtesy of the researchers.
While the optical flow technique is ideal for wide shots where
the action may be far away, the researchers also developed a related
technique for close-ups. In this version, the computer detects
the outlines of the person and, Malik says, "fits the equivalent
of a human skeleton inside." The software then observes how
the motion of the skeleton evolves over time. Because the computer
can shift the skeleton in three dimensions, the database does
not need to contain multiple angles of the same movement. This
technique, Malik says, could someday enhance computer speech recognition
systems with "gesture recognition."
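To see why a three-dimensional skeleton removes the need to store every viewing angle, consider the rough Python sketch below. It is an assumption-laden illustration rather than the researchers' method: the skeleton is reduced to five hypothetical joints, "shifting" is taken to mean rotating about the vertical axis, and the names `rotate_y`, `pose_distance`, and `view_invariant_distance` are invented for this example.

```python
import math

def rotate_y(pose, angle):
    """Rotate every joint of a pose about the vertical (y) axis."""
    c, s = math.cos(angle), math.sin(angle)
    return {joint: (c * x + s * z, y, -s * x + c * z)
            for joint, (x, y, z) in pose.items()}

def pose_distance(pose_a, pose_b):
    """Euclidean distance between two poses that share the same joint names."""
    return math.sqrt(sum((a - b) ** 2
                         for joint in pose_a
                         for a, b in zip(pose_a[joint], pose_b[joint])))

def view_invariant_distance(query, reference, steps=360):
    """Smallest distance to the reference over a sweep of viewing angles."""
    return min(pose_distance(rotate_y(query, 2 * math.pi * k / steps), reference)
               for k in range(steps))

# Hypothetical stored pose, in meters: one arm raised, one foot forward.
REFERENCE = {
    "head": (0.0, 1.7, 0.0),
    "left_hand": (0.6, 1.0, 0.1),
    "right_hand": (-0.3, 1.9, 0.0),
    "left_foot": (0.2, 0.0, 0.0),
    "right_foot": (-0.2, 0.0, 0.3),
}

# The same pose seen from a camera a quarter turn away.
turned = rotate_y(REFERENCE, math.radians(90))

print(pose_distance(turned, REFERENCE))            # large: views differ
print(view_invariant_distance(turned, REFERENCE))  # near zero: same pose
```

A direct comparison treats the two views as different poses, while the sweep over rotations recognizes them as the same one, so a single stored copy of each movement is enough. Tracking how such poses change from frame to frame would be a separate step not shown here.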
Of course, computer recognition of human motion has surveillance
applications as well. According to Malik, speedier hardware could
potentially enable his system to detect burglaries or other crimes in
progress. On the civilian side, he adds, it could also be used to
monitor swimming pools, keeping a constant vigil for actions that
are consistent with drowning.
More than surveillance, though, Malik is particularly excited about
how computational motion analysis and "Do As I Say" could
be applied by the entertainment industry. For example, a videogame
player could potentially control the motion of a human "character"
using a joystick. Or, the researchers write in their scientific
paper, a filmmaker might "collect a large database of, say,
Charlie Chaplin footage and then be able to 'direct' him in a new
movie."
Jitendra Malik's Home Page
Recognizing Humans and Their Activities
UC Berkeley Computer Vision Group
Center for Information Technology Research in the Interest of Society (CITRIS)
CITRIS Video (6:05)
Lab Notes is published online by the Public Affairs Office of the UC Berkeley
College of Engineering. The Lab Notes mission is to illuminate groundbreaking
research underway today at the College of Engineering that will
dramatically change our lives tomorrow.
Media contact: Teresa Moore, Lab Notes editor, Director of Public Affairs
Writer, Researcher: David Pescovitz
Web Manager: Michele Foley
Subscribe or send comments to the Engineering Public Affairs
Office: lab-notes@coe.berkeley.edu.
© 2003 UC Regents.
Updated 8/29/03.