OpenAI today announced an improved version of its most capable AI model to date, one that takes even more time to consider questions, just a day after Google announced its first model of this type.
The new OpenAI model, called o3, replaces o1, which the company introduced in September. Like o1, the new model spends time thinking about the problem so that it can provide better answers to questions that require step-by-step logical reasoning. (OpenAI decided to skip the “o2” moniker as it is already the name of a UK mobile operator.)
“We see this as the beginning of the next phase of AI,” OpenAI CEO Sam Altman said in a livestream on Friday. “Where you can use these models to do increasingly complex tasks that require a lot of reasoning.”
OpenAI says the o3 model scores much higher than its predecessor on several benchmarks, including those that test complex coding skills and advanced math and science abilities. It scores three times higher than o1 on ARC-AGI, a benchmark designed to test the ability of AI models to reason about extremely difficult mathematical and logical problems they are encountering for the first time.
Google is pursuing a similar line of research. Google researcher Noam Shazeer revealed in a post on X yesterday that the company has developed its own reasoning model, called Gemini 2.0 Flash Thinking. In a separate post, Google CEO Sundar Pichai called it “the most thoughtful model yet.” Google’s new model scored highly on SWE-Bench, a test that measures the agentic capabilities of models.
However, OpenAI’s new o3 model scores 20 percent higher than o1 on that test. “o3 blew it out of the water,” says Ofir Press, a postdoctoral researcher at Princeton University who helped develop SWE-Bench. “Very surprising increase, not sure how they did it.”
These two dueling models show that the competition between OpenAI and Google is fiercer than ever. It is crucial for OpenAI to demonstrate that it can continuously make progress in order to attract more investment and build a profitable business. Google, meanwhile, is desperate to show it remains at the forefront of AI research.
The new models also show how AI companies are increasingly looking beyond simply scaling AI models to squeeze more intelligence out of them.