To this end, the DeepMind team has been developing increasingly sophisticated geometry solvers. The first version, AlphaGeometry, was released last January; the second is called AlphaGeometry2.
The DeepMind team combined AlphaGeometry2 with another system they developed, AlphaProof, which constructs mathematical proofs. They found that the combined system could solve four of the six problems in this summer's IMO. For this new study, the research team extended their evaluation by giving the system geometry problems used in the IMO over the past 25 years.
The team constructed AlphaGeometry2 by integrating several core components, one of which is Google's Gemini language model. Other elements use mathematical rules to propose solutions or partial solutions to the original problem.
The team noted that many IMO problems cannot be solved until auxiliary constructions, such as extra points, lines, or circles, have been added to the diagram, so their system must be able to create them. The system then tries to predict which added constructions will be useful for the reasoning required to solve the problem. AlphaGeometry2 suggests steps that might be used to solve a given problem and checks the logic of these steps before using them.
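The propose-and-check loop described above can be pictured with a small toy program. This is only an illustrative sketch, not DeepMind's implementation: facts are plain strings, `deductive_closure` stands in for the rule-based reasoning component, `propose` stands in for the language model that suggests constructions, and every name and rule here is a hypothetical made up for the example.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

# A rule says: if all premises are known, the conclusion is known too.
Rule = Tuple[FrozenSet[str], str]


def deductive_closure(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply the rules repeatedly until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def solve(
    premises: Set[str],
    goal: str,
    rules: List[Rule],
    propose: Callable[[Set[str]], List[str]],
    max_constructions: int = 3,
) -> Optional[List[str]]:
    """Return the auxiliary constructions that were needed, or None on failure."""
    facts = set(premises)
    used: List[str] = []
    while True:
        # Check step: can the rules alone reach the goal from what we have?
        if goal in deductive_closure(facts, rules):
            return used
        if len(used) >= max_constructions:
            return None
        # Propose step: ask the (stand-in) model for a new construction.
        candidates = [c for c in propose(facts) if c not in facts]
        if not candidates:
            return None
        facts.add(candidates[0])
        used.append(candidates[0])


# Toy problem: in a triangle with AB = AC, show the base angles are equal.
# The rules only succeed once the midpoint M of BC has been constructed.
toy_rules: List[Rule] = [
    (frozenset({"AB = AC", "M is midpoint of BC"}), "triangle ABM ~= triangle ACM"),
    (frozenset({"triangle ABM ~= triangle ACM"}), "angle ABC = angle ACB"),
]


def toy_propose(facts: Set[str]) -> List[str]:
    """Hypothetical stand-in for the language model's suggestions."""
    return ["M is midpoint of BC"]


result = solve({"AB = AC"}, "angle ABC = angle ACB", toy_rules, toy_propose)
print(result)
```

In this toy version the proposer always returns the same suggestion; the point is only the division of labor, where one component suggests constructions and another verifies whether the goal now follows.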
To test their system, the researchers selected 45 geometry problems from the IMO. Some of these had to be translated into a more usable format, and in some cases split into more than one problem, resulting in a total of 50 problems. They reported that AlphaGeometry2 correctly solved 42 of them, slightly above the average performance of human gold medalists in the competition.