Google DeepMind's AI Wins Silver Medal in International Mathematical Olympiad
Google DeepMind has unveiled an artificial intelligence system that achieved silver-medal-level performance at this year's International Mathematical Olympiad (IMO). The hybrid system, which combines two specialized models, AlphaProof and AlphaGeometry 2, solved four of the six problems in the prestigious competition, marking a new milestone in AI's mathematical reasoning capabilities.
The International Mathematical Olympiad is widely regarded as the most challenging math competition for pre-university students and serves as a benchmark for evaluating the advanced problem-solving abilities of artificial intelligence. Google DeepMind's system scored 28 of 42 possible points, falling just one point short of the gold-medal threshold of 29 points, which was reached by only 58 of the 609 human participants.
At the core of this achievement is AlphaProof, a novel AI that combines a pre-trained language model with the AlphaZero reinforcement learning algorithm, the same technology used to master complex games like chess and Go. This integration allows AlphaProof to approach mathematical reasoning with game-like strategies, searching for promising proof steps in a vast decision tree. Previous methods were constrained by the scarcity of human-authored formal proofs; AlphaProof bridges the gap between natural language and formal language by using a fine-tuned version of Google's Gemini model to translate natural-language problems into formal statements, creating a large training corpus.
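DeepMind has not published AlphaProof's code, so the following is intuition only: an AlphaZero-style prover can be pictured as a value-guided search over proof states, where a learned network scores candidate proof steps and the most promising branch is expanded first. In this toy sketch, a hand-written heuristic stands in for the learned value network and simple arithmetic rewrites stand in for formal tactics; every name and rule here is hypothetical.

```python
import heapq

def candidate_steps(state):
    """Enumerate legal 'proof steps'. In a real prover these would be
    formal tactics; here they are toy rewrites on an integer goal,
    where reaching 0 plays the role of 'proved'."""
    steps = []
    if state % 2 == 0 and state > 0:
        steps.append(("halve", state // 2))
    if state > 0:
        steps.append(("decrement", state - 1))
    return steps

def score(state):
    """Stand-in for a learned value function: lower means the search
    believes this state is closer to a complete proof."""
    return state

def best_first_proof_search(start, max_expansions=1000):
    """Best-first search: always expand the state the value function
    likes most, and return the sequence of steps that reached 0."""
    frontier = [(score(start), start, [])]
    seen = set()
    while frontier and max_expansions > 0:
        _, state, path = heapq.heappop(frontier)
        if state == 0:
            return path          # 'proof' found
        if state in seen:
            continue
        seen.add(state)
        max_expansions -= 1
        for name, nxt in candidate_steps(state):
            if nxt not in seen:
                heapq.heappush(frontier, (score(nxt), nxt, path + [name]))
    return None                  # search budget exhausted

proof = best_first_proof_search(10)
print(proof)
```

The real system differs in almost every detail (learned policy and value networks, formal proof states, massive parallel search), but the control flow, namely score candidates, expand the best one, stop when the goal is closed, is the shared skeleton.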
Complementing AlphaProof, AlphaGeometry 2 is an upgraded version of DeepMind's geometry solver. The system demonstrates remarkable efficiency, solving one IMO problem in just 19 seconds. Its performance improvement is attributed to enhanced training data and a novel knowledge-sharing mechanism that enables it to tackle more complex problems. Prior to this year's competition, AlphaGeometry 2 solved 83% of all historical IMO geometry problems from the past 25 years, a significant improvement over its predecessor's 53% success rate.
This year's competition problems were manually translated into formal mathematical language so the AI systems could work on them. While human contestants tackle the problems in two sessions of 4.5 hours each, the AI system solved one problem within minutes but needed up to three days for others.
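The article does not name the formal language involved, but formalization of this kind typically targets a proof assistant such as Lean. As a purely illustrative example (not an actual IMO problem), the informal statement "the sum of two even natural numbers is even" might be rendered as a machine-checkable Lean 4 theorem:

```lean
-- Hypothetical illustration of formalization, not DeepMind's actual data.
-- "Even" is expressed directly as "equals twice some k"; the closing step
-- uses Lean 4's built-in `omega` linear-arithmetic tactic.
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k :=
  match ha, hb with
  | ⟨m, hm⟩, ⟨n, hn⟩ => ⟨m + n, by omega⟩
```

Translating an IMO problem, whose statement may involve subtle quantifiers and side conditions, into such a form without changing its meaning is itself nontrivial, which is why the competition problems were formalized by hand.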
Google DeepMind's success at the International Mathematical Olympiad highlights the rapid progress of AI on advanced reasoning tasks. As these systems continue to develop, they have the potential to accelerate scientific discovery and push the boundaries of human knowledge.
Google DeepMind plans to release more technical details about AlphaProof soon and continue exploring various AI approaches to advance mathematical reasoning.