| A Logical Approach to Computer Security
by David Pescovitz
Professor Sanjit Seshia joined UC Berkeley's Department of Electrical Engineering and Computer Sciences last year after receiving his PhD in Computer Science from Carnegie Mellon University.
One of the biggest challenges of computer security is that the people who write viruses are smarter than the software used to detect the malicious code. In fact, "malware detectors," like virus-scanning software, aren't very intelligent at all. They simply look at whether the pattern in a particular piece of code, an email attachment for example, matches the signature of a known virus. This isn't a logical approach, says UC Berkeley computer scientist Sanjit Seshia. He means that literally. Seshia and his colleagues are using computational logic to detect the behavioral traits of viruses even if their maliciousness is well-hidden by their creators.
"My research is at the intersection of computational logic and dependable computing," says Seshia, who is affiliated with the Center for Information Technology Research in the Interest of Society (CITRIS). "How do we make computer systems much more reliable, secure available, and safe?"
Most computer users count on antivirus software to rid them of viruses and worms that are transmitted via email attachments or dodgy Internet downloads. At the heart of these software packages are systems that attempt to identify malware through a collection of rules and "signatures," code patterns that act as identifiers of new viruses. It's called a syntactic approach—the software looks for particular sequences, or syntax, of bits. As new viruses are identified, the makers of the software offer updates containing the signatures of the latest threats.
"It turns out that current virus scanners are quite easy to fool," says Seshia, who participates in UC Berkeley's Team for Research in Ubiquitous Secure Technology (TRUST) and the Center for Hybrid and Embedded Software Systems (CHESS). "Malware writers obfuscate their code so that a signature that works today won't work tomorrow for what's essentially the same virus. So anti-virus companies are always trying to track the latest variants of viruses and worms and making sure people download the latest signatures to keep their definitions of viruses up to date."
Rather than look for commonalities in syntax, Seshia and his collaborators are developing mathematical algorithms based on semantics, essentially the "meaning" of the code. Thanks to some semantic smarts, the algorithm can detect variations of malware even if the nasty bit of code is well hidden by those who wrote the virus. He's collaborating on the research with professor Somesh Jha and graduate student Mihai Christodorescu of the University of Wisconsin-Madison, and Carnegie Mellon University professors Randal Bryant and Dawn Song.
"We're trying to develop a more behavioral definition of what it means to be malicious," Seshia says. "Perhaps it deletes files on your hard drive or duplicates itself and emails copies to everyone in your address book."
In computer science, this kind of semantic check is an "undecidable problem." There's no algorithm that can solve it every single time. The beauty of the algorithm though is that it's good enough in most cases, and it can crank through the calculations in a reasonable amount of time.
First, the researchers defined the behavioral signature of a virus, the fragment of the program's instruction set that's indicative of malicious behavior. This is used to form a template, Seshia says, "that's general enough to represent a whole family of programs."
"The algorithm translates the malware detection problem into a set of problems expressed in mathematical logic, which are handed off to another algorithm called a 'decision procedure,' says Seshia, whose major contributions were in the design and implementation of this procedure.
The overall algorithm involves a comparison of the unknown set of instruction sequences to the template. The algorithm abstracts away all superfluous information, including any variables that may have been altered to hide the maliciousness. The bare-bones program is then checked against the template to see if any red flags are raised. In recent experiments, the approach correctly identified multiple variants of two worms using a single template.
"It may be a computationally-hard problem to solve, but in practice, a good encoding of the problem enables the algorithm to run reasonably fast so we can think of eventually putting it into commercial tools," Seshia says.
While the computational logic methods are well-suited to fight the battle against computer viruses, Seshia is also applying his methods to hardware. He's developing tools for computer aided design of integrated circuits that will enable engineers to better verify their designs long before the expensive fabrication process begins.
"All of this research is about checking that a computer system will work as expected," he says. "If a piece of code is malicious, you'd like to know that before you run it."
Sanjit Seshia's home page
UC Berkeley Center for Hybrid and Embedded Software Systems (CHESS)
Team for Research in Ubiquitous Secure Technology (TRUST)
Center for Information Technology Research in the Interest of Society
Lab Notes is
published online by the Marketing and Communications Office of the UC Berkeley
College of Engineering. The Lab Notes mission is to illuminate groundbreaking
research underway today at the College of Engineering that will
dramatically change our lives tomorrow.
Media contact: Teresa
Moore, Lab Notes editor, Director of Marketing and Communications
Writer, Researcher: David
Web Manager: Michele
Subscribe or send comments to the Engineering Marketing and Communications Office: firstname.lastname@example.org.
© 2006 UC Regents.