DeepMind's Latest AI System, AlphaGeometry, Solves Complex Geometry Problems

2024-01-18

Google DeepMind and researchers from the Department of Computer Science at New York University have introduced AlphaGeometry, an artificial intelligence system that solves Olympiad-level geometry problems about as well as the world's strongest high school competitors. As described in a paper published in the journal Nature, AlphaGeometry performed close to the level of an International Mathematical Olympiad (IMO) gold medalist on a benchmark of 30 challenging classical geometry problems.

This work is a significant advance in applying artificial intelligence to mathematics and reasoning. AlphaGeometry correctly solved 25 of the 30 problems within the official Olympiad time limit. By comparison, the previous state-of-the-art geometry solver solved only 10, while the average human gold medalist solved 25.9.

Competing against highly trained young prodigies in mathematics competitions may seem like a narrowly defined challenge. However, geometry requires creativity and intuition in constructing valid proofs and understanding spatial relationships, which are core abilities of advanced reasoning.

So far, artificial intelligence has had a rocky road in mathematics, and in geometry especially. Unlike other areas of mathematics, geometry relies on visual-spatial reasoning, and translating human proofs into a machine-verifiable form is particularly difficult, which creates a data bottleneck. Existing machine learning methods often fail here because so few human proofs have been translated into machine-verifiable language.

So how does DeepMind's system achieve Olympiad-level skill? AlphaGeometry combines a neural language model with a rule-based deductive engine. The language model offers intuitive suggestions about which paths are worth exploring, while the deductive engine rigorously checks each logical step of the proof in Euclidean geometry. The neural component steers the search toward promising ideas, and the symbolic logic guarantees mathematical correctness, which makes for a flexible pairing.

If you are familiar with Daniel Kahneman's book "Thinking, Fast and Slow," you may recognize the parallel. One system supplies quick, "intuitive" ideas, while the other makes slower, more deliberate, rational decisions.
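The division of labor described above can be sketched as a simple loop. This is a toy illustration, not DeepMind's implementation: facts are plain strings, the two rules are hand-written implications about an isosceles triangle, and `toy_proposer` is a hypothetical stand-in for the language model that merely walks a fixed list of candidate constructions. Every name in the block is an assumption made for the example.

```python
from typing import Callable, FrozenSet, List, Optional, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)

# Hypothetical toy rules: prove the base angles of an isosceles triangle equal.
RULES: List[Rule] = [
    (frozenset({"AB=AC", "D is midpoint of BC"}), "triangle ABD ~= triangle ACD"),
    (frozenset({"triangle ABD ~= triangle ACD"}), "angle ABC = angle ACB"),
]
GOAL = "angle ABC = angle ACB"


def deduce_closure(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """The 'slow' system: apply rules until nothing new can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


def toy_proposer(facts: Set[str], goal: str) -> Optional[str]:
    """The 'fast' system (stub): suggest the next untried auxiliary construction.

    A real system would use a trained language model here; this stub is an
    assumption for illustration and ignores the goal entirely.
    """
    for candidate in ("E is on AC", "D is midpoint of BC"):
        if candidate not in facts:
            return candidate
    return None


def solve(
    premises: Set[str],
    goal: str,
    rules: List[Rule],
    propose: Callable[[Set[str], str], Optional[str]],
    max_constructs: int = 3,
) -> Optional[List[str]]:
    """Alternate rigorous deduction with proposed constructions.

    Returns the list of constructions that were added before the goal was
    reached, or None if the goal was not reached within the budget.
    """
    facts = set(premises)
    added: List[str] = []
    while True:
        facts = deduce_closure(facts, rules)
        if goal in facts:
            return added
        if len(added) >= max_constructs:
            return None
        suggestion = propose(facts, goal)
        if suggestion is None or suggestion in facts:
            return None
        facts.add(suggestion)
        added.append(suggestion)


if __name__ == "__main__":
    print(solve({"AB=AC"}, GOAL, RULES, toy_proposer))
```

In this toy run the proposer's first suggestion is useless and the second one (the midpoint of BC) unlocks the proof. The point of the design is that the proposer is allowed to be wrong: only the deductive engine decides whether the goal has actually been proven.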

By using synthetic data (100 million unique, automatically generated geometry problems and proofs), the researchers were able to train AlphaGeometry's language model without relying on expensive and scarce human expert annotations. Training on such a large and varied pool of novel problems broadened what the system could learn.
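The article does not detail how the synthetic problems were produced. One plausible recipe, sketched here purely as an assumption, is to sample random premises, let a deduction engine derive everything it can, and then trace each derived fact back to the premises it actually depends on, so that every derived fact yields a (premises, conclusion) training pair. The rule set, the premise pool, and all function names below are hypothetical, and the sketch assumes every rule has at least one premise.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple

Rule = Tuple[FrozenSet[str], str]  # (premises, conclusion)

# Hypothetical toy rules and premise pool, for illustration only.
RULES: List[Rule] = [
    (frozenset({"AB=AC", "D is midpoint of BC"}), "triangle ABD ~= triangle ACD"),
    (frozenset({"triangle ABD ~= triangle ACD"}), "angle ABC = angle ACB"),
]
POOL: List[str] = ["AB=AC", "D is midpoint of BC", "E is on AC"]


def deduce_with_trace(premises: List[str], rules: List[Rule]) -> Dict[str, FrozenSet[str]]:
    """Forward-chain to closure, recording the facts each new fact was derived from.

    Starting premises map to an empty set of parents.
    """
    parents: Dict[str, FrozenSet[str]] = {p: frozenset() for p in premises}
    changed = True
    while changed:
        changed = False
        for needs, conclusion in rules:
            if conclusion not in parents and all(n in parents for n in needs):
                parents[conclusion] = needs
                changed = True
    return parents


def used_premises(fact: str, parents: Dict[str, FrozenSet[str]]) -> Set[str]:
    """Walk back from a fact to the starting premises it depends on."""
    stack, seen, roots = [fact], set(), set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if parents[current]:
            stack.extend(parents[current])
        else:
            roots.add(current)
    return roots


def make_examples(
    pool: List[str], rules: List[Rule], k: int, rng: random.Random
) -> List[Tuple[List[str], str]]:
    """Sample k premises and emit one (needed premises, conclusion) pair per derived fact."""
    premises = rng.sample(pool, k)
    parents = deduce_with_trace(premises, rules)
    return [
        (sorted(used_premises(fact, parents)), fact)
        for fact in parents
        if parents[fact]  # skip the sampled premises themselves
    ]


if __name__ == "__main__":
    for needed, conclusion in make_examples(POOL, RULES, len(POOL), random.Random(0)):
        print(needed, "=>", conclusion)
```

Because the traceback keeps only the premises a conclusion depends on, the irrelevant premise "E is on AC" never appears in a generated example. No human-written proof is involved at any step, which is what makes this kind of data cheap to produce in bulk.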

AlphaGeometry's success on Olympiad-level geometry problems is more than an academic achievement. It demonstrates that artificial intelligence can engage in complex mathematical reasoning, a skill long considered a stronghold of human intelligence. While its current expertise is limited to geometry, the same methods could in principle be adapted to other areas of mathematics. In addition, the system's ability to generate and solve complex problems autonomously paves the way for applications in fields ranging from engineering to theoretical research.

By open-sourcing AlphaGeometry's code and models, DeepMind aims to provide a starting point for future projects, both internal and external, that further strengthen the reasoning capabilities of artificial intelligence. Collaboration between organizations may eventually lead to artificial general intelligence (AGI) with stronger scientific, logical, and learning foundations.

The researchers note that because each Olympiad has six problems, of which only two are typically focused on geometry, AlphaGeometry can be applied to only one-third of a given Olympiad's problems. Even so, its geometry capability alone makes it the world's first artificial intelligence model to pass the IMO bronze medal threshold for the 2000 and 2015 competitions.