A Google AI project is smart enough to detect real-world software vulnerabilities on its own, according to the company’s researchers.
Google’s AI program recently discovered a previously unknown, exploitable bug in SQLite, an open source database engine. The company reported the vulnerability before it reached an official software release, and the SQLite developers issued a fix last month.
“We believe this is the first public example of an AI agent finding a previously unknown exploitable memory-safety issue in widely used real-world software,” Google security researchers wrote in a blog post on Friday.
The news joins growing research showing that today’s large language models have the potential to find software vulnerabilities, potentially giving the tech industry a much-needed edge in protecting software against hackers.
This is not the first time an AI program has discovered flaws in software. In August, for example, another large language model-based program, called Atlantis, discovered a separate bug in SQLite. Meanwhile, machine learning models, a subset of AI, have been used for years to find potential vulnerabilities in software code.
However, Google says that the achievement with its AI program shows that it is possible for large language models to detect more complex errors before the software itself is released. “We think this is a promising path toward ultimately turning the tables and achieving an asymmetric advantage for defenders,” the company’s researchers wrote.
Google’s project was originally called “Project Naptime” before it became “Big Sleep,” a joke about how the company’s researchers hope the AI program will become capable enough to let Google’s human researchers “take regular naps” at work.
Big Sleep was specifically created with special tools intended “to mimic the workflow of a human security researcher” when examining the computer code of a specific program. Google also developed Big Sleep to look at variants of existing security flaws, which are often a recurring problem in today’s software that hackers will be eager to exploit.
“Recently, we decided to put our models and tools to the test by running our first extensive real-world variant analysis experiment on SQLite,” the Google researchers write. This involved letting Big Sleep review recent changes to the SQLite code base. Google’s AI agent found an input that crashed SQLite, then investigated the crash to better understand and explain the problem through a root-cause analysis.
As a result, Google researchers wrote: “When equipped with the right tools, current LLMs can conduct vulnerability research.” That said, the blog post acknowledges that a specialized testing tool known as a “target-specific fuzzer,” which feeds large volumes of random inputs into a program, would also have been effective at finding the same bug in SQLite.
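To make the comparison concrete, here is a minimal sketch of the fuzzing idea the post refers to. This is not Google’s tooling or SQLite’s test harness; the `parse_record` target and its planted flaw are hypothetical, and real fuzzers such as AFL or libFuzzer are far more sophisticated:

```python
import random

def parse_record(data: bytes) -> int:
    """Hypothetical target function with a planted bug for demonstration."""
    if len(data) > 2 and data[0] == 0xFF:
        # Planted flaw: a specific malformed header triggers a crash.
        raise ValueError("malformed record header")
    return len(data)

def fuzz(target, iterations=10_000, max_len=64, seed=0):
    """Feed random byte strings to `target`, collecting inputs that crash it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            target(data)
        except Exception as exc:
            crashes.append((data, exc))
    return crashes

crashing_inputs = fuzz(parse_record)
```

The trade-off the researchers highlight is that a fuzzer like this surfaces crashing test cases but not explanations, whereas an LLM agent can also attempt the root-cause analysis of why the crash occurs.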
However, the company’s researchers conclude: “We hope that in the future this effort will lead to a significant advantage for defenders – with the potential not only to find crashing test cases, but also to provide high-quality root-cause analysis, triaging and fixing issues could be much cheaper and more effective in the future.”