Volume 3, Issue 7
September 2003


Lab Notes, Research from the College of Engineering

Vision and Motion
by David Pescovitz


UC Berkeley professor Jitendra Malik, associate chair of the computer science division, has become an expert in the graceful human dynamics of ballet. Malik is not a ballet dancer, though, nor a particularly big fan of classical dance. He has had to study the subtleties of pliés and relevés and a host of other human motions in order to teach a computer to identify what a person is doing just by watching him or her. His novel approach to the computational analysis of human movement has many applications, from ultra-realistic videogames, in which players control human actors on the screen, to surveillance.

"A big aspect of human intelligence is vision: how we, using our eyes, understand the world around us," says Malik, who is also a researcher with the Center for Information Technology Research in the Interest of Society (CITRIS). "But it's not just that we need to recognize a tiger in front of us. We need to recognize that the tiger is jumping toward us."

To give computers that same capability, Malik and graduate students Alyosha Efros, Greg Mori and Alex Berg have developed a software system that instantly classifies the videotaped actions of a human figure in various settings, from the World Cup to a ballet recital. They have also used their system to create a video sequence in which a World Cup player appears to mimic the motion of one of the graduate students showing off his soccer skills on videotape. The researchers will present their results in a scientific paper at the IEEE International Conference on Computer Vision in France next month.

Here's how the underlying technology works: The researchers provide the computer with a digitized video clip of, for example, a televised soccer game. Even though the players are quite small, the pattern of motion (a run toward the goal, for instance) is easily identifiable by the cyclic motions of the hands and legs.

The software captures that pattern and computes the "optical flow," the local movement of each pixel in the part of the image being analyzed. Those optical flow vectors are then compared to the data in a library of predetermined patterns, representing walks, jumps, ballet movements, and other human motions shot from a variety of angles. "The computer finds the best match and identifies the action," Malik says.
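The matching step described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' implementation: it assumes the optical flow has already been computed and stored as one (dx, dy) vector per pixel, it uses normalized correlation as the similarity score (an assumption; the article does not say how patterns are compared), and the tiny library, its labels, and the function names are all hypothetical.

```python
import math

# A flow field is a flat list of (dx, dy) vectors, one per pixel.
# Image coordinates are assumed, so negative dy means upward motion.

def flow_similarity(a, b):
    """Normalized correlation between two flow fields of equal size."""
    if len(a) != len(b):
        raise ValueError("flow fields must have the same number of vectors")
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    norm_a = math.sqrt(sum(x * x + y * y for x, y in a))
    norm_b = math.sqrt(sum(x * x + y * y for x, y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a motionless field correlates with nothing
    return dot / (norm_a * norm_b)

def classify_action(query, library):
    """Return (label, score) for the library pattern that best matches query."""
    best_label, best_score = None, -math.inf
    for label, template in library:
        score = flow_similarity(query, template)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score

# Hypothetical library of predetermined patterns for a 2x2-pixel window.
LIBRARY = [
    ("run_right", [(1.0, 0.0), (1.0, 0.0), (0.8, 0.1), (0.9, -0.1)]),
    ("jump", [(0.0, -1.0), (0.0, -1.0), (0.1, -0.9), (-0.1, -0.8)]),
    ("stand", [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]),
]

# Hypothetical flow measured from a new clip: mostly rightward motion.
query = [(0.9, 0.05), (1.1, 0.0), (0.7, 0.0), (1.0, -0.05)]
label, score = classify_action(query, LIBRARY)
print(label)
```

A real system would compare whole sequences of frames rather than a single flow field, and its library would hold many examples of each action shot from different angles; the nearest-match idea stays the same.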

Given a video input sequence of frames depicting a soccer player (above), the software identifies the most closely matching frames in a database of 3D motion capture data (below). The motion capture data is rendered from many viewing directions using a stick figure.
Image courtesy of the researchers.



While the optical flow technique is ideal for wide shots where the action may be far away, the researchers also developed a related technique for close-ups. In this version, the computer detects the outlines of the person and, Malik says, "fits the equivalent of a human skeleton inside." The software then observes how the motion of the skeleton evolves over time. Because the computer can shift the skeleton in three dimensions, the database does not need to contain multiple angles of the same movement. This technique, Malik says, could someday enhance computer speech recognition systems with "gesture recognition."
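To see why a three-dimensional skeleton removes the need to store several camera angles, consider the rough Python sketch below. It is an illustration under stated assumptions rather than the researchers' method: a single stored pose is rotated about the vertical axis and flattened with a simple orthographic projection, and a brute-force search finds the viewing angle that best matches the joints seen in the image. The joint coordinates, the projection model, and the function names are all hypothetical.

```python
import math

def rotate_about_vertical(joints, angle_deg):
    """Rotate 3D joints (x, y, z) about the vertical y-axis."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in joints]

def project(joints):
    """Orthographic projection (an assumption): drop the depth coordinate."""
    return [(x, y) for x, y, _ in joints]

def view_error(observed_2d, joints_3d, angle_deg):
    """Sum of squared distances between observed and projected joints."""
    projected = project(rotate_about_vertical(joints_3d, angle_deg))
    return sum(
        (px - ox) ** 2 + (py - oy) ** 2
        for (px, py), (ox, oy) in zip(projected, observed_2d)
    )

def best_view(observed_2d, joints_3d, step_deg=5):
    """Brute-force search for the viewing angle with the lowest error."""
    return min(
        range(0, 360, step_deg),
        key=lambda angle: view_error(observed_2d, joints_3d, angle),
    )

# Hypothetical stored pose: head, two hands, two feet (in meters).
POSE = [
    (0.0, 1.7, 0.0),
    (0.3, 1.4, 0.1),
    (-0.3, 1.4, -0.1),
    (0.2, 0.0, 0.3),
    (-0.2, 0.0, -0.3),
]

# Simulate a camera that sees the same pose from 40 degrees to the side.
observed = project(rotate_about_vertical(POSE, 40))
print(best_view(observed, POSE))
```

One stored pose is enough here because the rotation is done at matching time. The optical flow approach, which works on flat image patterns, needs separate library entries for each viewing direction.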

Of course, computer recognition of human motion has surveillance applications as well. According to Malik, speedier hardware could potentially enable his system to detect burglaries or crimes in progress. On the civilian side, he adds, it could also be used to monitor swimming pools, keeping a constant vigil for actions that are consistent with drowning.

More than surveillance, though, Malik is particularly excited about how computational motion analysis and a related technique the researchers call "Do As I Say" could be applied by the entertainment industry. For example, a videogame player could potentially control the motion of a human "character" using a joystick. Or, the researchers write in their scientific paper, a filmmaker might "collect a large database of, say, Charlie Chaplin footage and then be able to 'direct' him in a new movie."



Lab Notes is published online by the Public Affairs Office of the UC Berkeley College of Engineering. The Lab Notes mission is to illuminate groundbreaking research underway today at the College of Engineering that will dramatically change our lives tomorrow.

Media contact: Teresa Moore, Lab Notes editor, Director of Public Affairs
Writer, Researcher: David Pescovitz
Web Manager: Michele Foley

Subscribe or send comments to the Engineering Public Affairs Office: lab-notes@coe.berkeley.edu.

© 2003 UC Regents. Updated 8/29/03.