
Google DeepMind’s latest Gemini Deep Think artificial intelligence model has achieved gold-medal-level performance at the International Mathematical Olympiad.
The IMO is known as the most prestigious and challenging mathematics competition for high-school students in the world. Only about 10% of this year’s competitors received gold medals, and numerous Fields Medallists have won it in the past.
Various AI companies tested their models on the IMO 2025 questions, hoping that top scores would impress the brightest researchers, who often have a background in competitive mathematics. Unfortunately for Google, its gold-medal result matched that of its rival OpenAI – both models solved five of the six problems, scoring 35 out of a possible 42 points – underlining how tight the race for AI supremacy has become.
The race for the best maths AI turns dirty
Unofficial grading
Perhaps to distract from this reality, Google’s team has been making digs at the ChatGPT-maker’s achievement. According to DeepMind researcher Thang Luong, as well as AI safety advocate Mikhail Samin, OpenAI’s model was not graded according to the International Mathematical Olympiad’s official guidelines, and so its claim to a gold medal cannot be verified.
Senior OpenAI researcher Noam Brown posted on X that the Olympiad reached out to his company about participating in a non-natural-language version of the competition, but it declined as it was prioritising its work on natural-language systems.
OpenAI nevertheless chose to try out the questions on one of its unreleased models, and researcher Alexander Wei said on X that three former IMO medallists independently graded its answers and reached a “unanimous consensus” on the scores.
In the introduction to its announcement, DeepMind made sure to highlight that it was among “an inaugural cohort to have our model results officially graded and certified by IMO coordinators using the same criteria as for student solutions.”
Premature announcements
But the scoring wasn’t the only aspect Google took issue with. OpenAI revealed its gold-medal result on Saturday morning, just hours after the Olympiad had announced the high-school winners on Friday evening. Formal AI participants were apparently required to wait a certain amount of time after the announcement of the human results before publishing their own; some say this was a week, others say ten days.
Brown said on X that the Olympiad’s organisers told him OpenAI simply needed to wait until after the high schoolers’ results were made public on Friday night before unveiling its own. The Google DeepMind announcement, for its part, made a point of respectfully “recognizing the significant accomplishments of this year’s student-participants” by waiting until Monday.
Gemini Deep Think’s result represents a huge leap in maths capabilities
The version of Gemini Deep Think that tackled the International Mathematical Olympiad is an “enhanced reasoning model,” meaning it is designed to solve complex problems by mimicking human-like, step-by-step logic.
It was trained with novel reinforcement learning techniques, on a large corpus of solutions to mathematical problems, and with general hints on how to approach the kinds of problems the Olympiad sets. The competition requires competitors to show their working as well as reach the correct answer.
While its gold-medal result is no better than that of OpenAI’s model, it does mark a notable jump in maths capabilities. In July 2024, a combined version of DeepMind’s AlphaProof and AlphaGeometry 2 systems achieved only a silver-medal standard with a score of 28, and the questions first had to be translated from natural language into formal, domain-specific languages before the systems could tackle them. No such translation was required for Deep Think, which worked entirely in natural language.
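To give a sense of what that translation step involves, here is a toy illustration (chosen for exposition, not one of the actual IMO problems): the informal statement “the sum of two even integers is even”, written roughly along these lines in the Lean proof assistant that AlphaProof worked with, assuming Lean 4 with the Mathlib library available.

    import Mathlib

    -- Informal statement: "the sum of two even integers is even".
    -- Formal statement and proof, using Mathlib's `Even` predicate and its
    -- lemma `Even.add` (even + even = even).
    example (a b : ℤ) (ha : Even a) (hb : Even b) : Even (a + b) :=
      Even.add ha hb

A competition problem is vastly harder than this, but the principle is the same: last year’s systems needed each problem restated in this kind of machine-checkable form, whereas Deep Think reads and answers the English wording directly.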
“We believe agents that combine natural language fluency with rigorous reasoning – including verified reasoning in formal languages – will become invaluable tools for mathematicians, scientists, engineers, and researchers, helping us advance human knowledge on the path to AGI,” DeepMind researchers wrote in their announcement.
Google says it will first roll out this model to trusted testers, including mathematicians, before expanding access to subscribers of the $250-per-month Google AI Ultra plan. It has not yet given any specific dates.
Last week, Google announced that the model behind AI Mode in Search can be upgraded to Gemini 2.5 Pro, giving it new maths capabilities.