DeepMind AI Achieves Gold Medal Level in Math Olympiad

2025-02-11

Researchers at Google DeepMind report that their AlphaGeometry2 AI achieved gold medal-level performance when solving geometry problems posed to high school students at the International Mathematical Olympiad (IMO) over the past 25 years. The team published a paper on the arXiv preprint server detailing AlphaGeometry2 and its performance on the IMO questions.
Overview of the search algorithm. Source: arXiv (2025). DOI: 10.48550/arxiv.2502.03544

Prior studies have suggested that AI systems able to solve geometry problems could pave the way for more sophisticated applications, because such problems demand advanced reasoning and the ability to choose among many possible solution steps.

To this end, the DeepMind team has been developing increasingly sophisticated geometry solvers. The first, AlphaGeometry, was released last January; the second version is AlphaGeometry2.

The DeepMind team combined AlphaGeometry2 with another of its systems, AlphaProof, which carries out mathematical proofs, and found that the pair could solve four of the six problems posed at last summer's IMO. For this new study, the researchers expanded testing of the system's capabilities by giving it a larger set of questions used in the IMO over the past 25 years.

The team constructed AlphaGeometry2 by integrating several core components, one of which is Google's Gemini language model. Other elements use mathematical rules to propose solutions or partial solutions to the original problem.

The team noted that many IMO geometry problems cannot be solved until certain auxiliary constructions are added to the diagram, so the system must be able to generate them. AlphaGeometry2 therefore predicts which constructions to add to support the reasoning a given problem requires, suggests proof steps that might solve it, and checks the logic of each step before using it.
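In other words, the system alternates between proposing and verifying: a language model suggests auxiliary constructions and candidate steps, while a symbolic engine checks what can be rigorously deduced from the current diagram. The sketch below illustrates that general loop in Python; the helper names, the toy rule format, and the canned suggestions are hypothetical stand-ins for illustration, not DeepMind's actual code or API.

```python
# Minimal sketch of a propose-and-verify loop, under assumed toy data
# structures. Facts and constructions are plain strings; rules are
# (premises, conclusion) pairs. Not DeepMind's implementation.

def deduce_closure(facts, rules):
    """Verify step: apply forward-chaining rules until no new facts appear."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

def propose_constructions(goal, known):
    """Stand-in for the language model: yield candidate auxiliary
    constructions. A real system would condition a trained model on the
    problem state instead of returning canned suggestions."""
    yield "midpoint M of AB"
    yield "circle through A, B, C"

def solve(initial_facts, goal, rules, max_steps=10):
    facts = set(initial_facts)
    for _ in range(max_steps):
        known = deduce_closure(facts, rules)      # purely symbolic deduction
        if goal in known:
            return known                          # goal now follows from the diagram
        added = False
        for construction in propose_constructions(goal, known):
            if construction not in facts:         # check a suggestion before using it
                facts.add(construction)
                added = True
                break
        if not added:
            return None                           # no usable suggestion left
    return None                                   # unsolved within the budget

# Toy run: the goal only becomes derivable after an auxiliary construction.
rules = [
    (frozenset({"midpoint M of AB", "AB = CD"}), "AM = MB"),
    (frozenset({"AM = MB"}), "goal"),
]
print(solve({"AB = CD"}, "goal", rules) is not None)  # True
```

A production system would replace the canned suggestions with a trained model and the toy rules with a full geometric deduction engine, but the control flow (deduce, check the goal, add a verified construction, repeat) follows the description above.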

To test the system, the researchers selected 45 questions from past IMOs; because some had to be translated into a more usable format, the final set contained 50 problems. They report that AlphaGeometry2 correctly solved 42 of them, slightly above the average performance of human gold medalists in the competition.