in the News
by David Pescovitz
Miller's interest in computer vision was sparked when
she spent a summer at the University of Florida working
on an eye tracking system for a hands-free computer mouse
as part of the National Science Foundation's prestigious
Research Experiences for Undergraduates (REU) program.
(David Pescovitz photo)
The human face has 80 muscles that work in tandem to create a seemingly
infinite array of expressions that dramatically change the way we
look from moment to moment. While humans have a relatively easy time
matching our contorting faces to names, computers are notoriously
bad at it. If software could automatically and accurately identify
people's faces, though, myriad applications would emerge--from intelligent
surveillance systems to software that helps us navigate massive collections
of photographs. UC Berkeley PhD candidate Tamara Miller and computer
science professor David Forsyth are tackling the latter in an effort
to advance the science of computer face recognition as a whole.
The researchers developed a system that automatically associates
45,000 face images culled from online news articles with the names
of the individuals in the photos. In their current demonstration,
a user is presented with a cluster of photos depicting a single individual--top
United Nations weapons inspector Hans Blix, for instance. The more
someone appears in the news, the larger the cluster of images. Clicking
on a particular photo links the image to its associated news article.
"The system enables you to browse the news by faces and bring
up articles related to the people you see," Miller says.
The software is remarkably adept at identifying dozens of images
of, say, Colin Powell even when the photos depict the Secretary of
State from a variety of angles, under different lighting conditions,
and with dozens of very different expressions on his face.
"Most photos in the news aren't mug shots with the person looking
right into the camera," says Forsyth, a researcher with the
Center for Information Technology Research in the Interest of Society
(CITRIS). "People do all kinds of remarkable things with their
faces. For example, we have piles of photographs of George Bush
biting his upper lip when he's nervous."
This figure shows a representative set of face/name clusters. The picture greatly exaggerates the error rate in order to show interesting phenomena and all the types of error the researchers encounter. For example, the clustering is effective even with individuals who wear moustaches (John Bolton) and eyeglasses (Hans Blix). Yet the James Bond cluster depicts photographs of an actor who portrayed the character and, erroneously, another individual who played the role of a villain in one film. (courtesy the researchers)
One potential application of the technology is a tool that automatically
organizes and enables easy searches of photographic archives without
depending solely on text annotations. Also, while the Berkeley research
is not focused on surveillance, Forsyth imagines it could lead to
a system that analyzes video footage taken during or before a criminal
activity to flag possible suspects.
In a recent scientific
paper, Forsyth, Miller, and their colleagues report that the system
is correct 95% of the time. Sometimes though, "one
innocent error by the program could cause considerable offense," Forsyth
says. For example, due to a mistake extracting names from the caption,
the system incorrectly labeled a photo of the German Justice Minister
as Adolf Hitler.
The process of linking a massive collection of faces with names
begins with extracting the faces from the rest of a photograph.
Software written by Miller then corrects, or rectifies, the position
of each face so that it matches a "canonical" pose
that can be compared with other faces. The rectifying software
runs on the Millennium Cluster, a CITRIS testbed of more than 1,000
individual PCs that work in parallel to solve computationally intensive problems.
"The rectifying software finds the eyes, nose, and mouth and
conducts the transformation between the original and the canonical pose."
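As a rough illustration of what rectification involves, the sketch below fits a least-squares similarity transform (rotation, uniform scale, and shift) that moves detected landmarks onto fixed canonical positions. It is not the researchers' code: the canonical coordinates, the landmark names, and the functions `fit_similarity` and `rectify_point` are all assumptions made for this example, and the article does not say which kind of transformation the actual software uses. Points are stored as complex numbers (x + yj) so that a single complex multiplication handles rotation and scaling together.

```python
# Hypothetical canonical landmark positions in a 100x100 face image.
# These coordinates are illustrative assumptions, not values from the research.
CANONICAL = {
    "left_eye": complex(30, 40),
    "right_eye": complex(70, 40),
    "nose": complex(50, 60),
    "mouth": complex(50, 80),
}


def fit_similarity(landmarks, canonical=CANONICAL):
    """Return (a, b) so that a * p + b best maps each landmark p onto its
    canonical position in the least-squares sense."""
    keys = sorted(canonical)
    src = [landmarks[k] for k in keys]
    dst = [canonical[k] for k in keys]
    n = len(keys)
    src_mean = sum(src) / n
    dst_mean = sum(dst) / n
    # Closed-form least-squares solution for a complex linear fit.
    num = sum((d - dst_mean) * (s - src_mean).conjugate()
              for s, d in zip(src, dst))
    den = sum(abs(s - src_mean) ** 2 for s in src)
    a = num / den                   # rotation and scale
    b = dst_mean - a * src_mean     # translation
    return a, b


def rectify_point(point, a, b):
    """Map one image point into the canonical frame."""
    return a * point + b
```

In use, `fit_similarity` would be called once per detected face, and `rectify_point` (or its inverse) would then be applied to every pixel coordinate to resample the face into the canonical frame.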
The identification process is helped along by extracting names from
the captions that accompany the photos. Labeling the photos based
only on the captions is not possible, though, because, for instance,
there may be several people in a particular photograph. While humans
can determine from the caption who is who, computers are tripped up
by the syntax of the text. Instead, each face in a photo is initially
associated with all of the names in its caption. Then the computer
compares the face with already established clusters of named faces to
determine statistically which of those names most likely belongs to it.
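The idea of starting each face with every name in its caption and then narrowing the choice by comparison with clusters can be sketched as a constrained, k-means-style loop. This is a minimal illustration under stated assumptions, not the method reported in the researchers' paper: each face is assumed to be already reduced to a short numeric feature vector, every face is assumed to have at least one candidate name, and the function `assign_names` and its helpers are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign_names(faces, iterations=10):
    """faces: list of (feature_vector, candidate_names) pairs.
    Returns one chosen name per face, always drawn from its own caption."""
    # Start by counting every face toward every name in its caption.
    members = {}
    for vec, names in faces:
        for name in names:
            members.setdefault(name, []).append(vec)

    centers = {}
    labels = [None] * len(faces)
    for _ in range(iterations):
        for name, vecs in members.items():
            if vecs:  # keep the old center if a cluster empties out
                centers[name] = mean(vecs)
        # Each face picks the closest cluster among its caption's names only.
        new_labels = [
            min(names, key=lambda n: sq_dist(vec, centers[n]))
            for vec, names in faces
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        members = {name: [] for name in centers}
        for (vec, _), name in zip(faces, labels):
            members[name].append(vec)
    return labels
```

Restricting each face to the names in its own caption is what distinguishes this from ordinary clustering: the caption narrows the candidates, and the image comparison picks among them.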
While the kinks in the software are still being worked out, including
its inability to label faces photographed in profile, the development
of a massive image database that can be automatically labeled is
a leap forward for computer face recognition.
"One problem in face recognition research is that the experimental
datasets of images that people use are often very different from
the real world," says Forsyth, a researcher with the Center
for Information Technology Research in the Interest of Society. "It's a
bit like studying animal behavior in a zoo. You can do it, but you
can never be certain about what you've learned. Our dataset is more
realistic because it contains faces captured 'in the wild.'"
Lab Notes is
published online by the Public Affairs Office of the UC Berkeley
College of Engineering. The Lab Notes mission is to illuminate groundbreaking
research underway today at the College of Engineering that will
dramatically change our lives tomorrow.
© 2004 UC Regents.