In a historic breakthrough for artificial intelligence, advanced models developed by Google DeepMind and OpenAI have achieved gold-medal performance at the prestigious International Mathematical Olympiad (IMO), showcasing remarkable progress in machine reasoning and natural-language problem-solving.
The 66th edition of the IMO was held on the Sunshine Coast in Queensland, Australia, drawing 630 of the world's top young mathematicians, 67 of whom earned gold medals. Notably, Google's Gemini Deep Think model was formally entered into the competition and solved five of the six problems, reaching the gold-medal threshold set for human competitors.
OpenAI, though not officially entered in the competition, tested an experimental model internally on the same IMO problems and reported gold-level performance. The model relied on heavy test-time compute, exploring multiple chains of reasoning in parallel before committing to an answer. Although the result is unofficial, OpenAI published it immediately after the contest, citing confidence in the model's performance.
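OpenAI has not disclosed the details of its system, but the general pattern described here, sampling many independent reasoning chains and then aggregating them, can be sketched in a few lines. The snippet below is a minimal illustration that settles on the most common final answer across chains (a simple self-consistency vote); the `generate` callable, the answer-extraction rule, and the voting scheme are assumptions for illustration, not a description of OpenAI's actual pipeline.

```python
import concurrent.futures
from collections import Counter
from typing import Callable

# `generate` stands in for any call that maps a problem statement to one full
# chain-of-thought solution string; the name and signature are hypothetical.
Generator = Callable[[str], str]

def extract_final_answer(chain: str) -> str:
    """Treat the last non-empty line of a reasoning chain as its final answer
    (an illustrative convention, not a real parsing rule)."""
    lines = [line.strip() for line in chain.splitlines() if line.strip()]
    return lines[-1] if lines else ""

def solve_with_parallel_chains(generate: Generator, problem: str, n_chains: int = 16) -> str:
    """Sample several independent reasoning chains in parallel, then return
    the most common final answer (a self-consistency vote; real systems may
    rank candidate chains with a learned verifier instead)."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_chains) as pool:
        chains = list(pool.map(lambda _: generate(problem), range(n_chains)))
    votes = Counter(extract_final_answer(chain) for chain in chains)
    answer, _ = votes.most_common(1)[0]
    return answer
```

The key design point is that additional parallel samples buy reliability at inference time, trading compute for accuracy without retraining the underlying model.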
This achievement marks the first time general-purpose AI language models have demonstrated sustained, high-level mathematical reasoning in natural language under official competition conditions. It is a sharp departure from earlier systems, which relied primarily on symbolic logic or solvers built around formal proof languages.
Experts from both academia and the AI industry have hailed the milestone as a turning point. Junehyuk Jung, a researcher affiliated with Brown University and DeepMind, emphasized that AI’s growing ability to reason in natural language could soon unlock collaboration between machines and human researchers in solving real scientific problems.
Sam Altman, CEO of OpenAI, described the result as a goal the company set years ago and has only now realized. He and other leaders in the field acknowledged, however, that these models will not be made public immediately, citing safety, scalability, and validation concerns; OpenAI expects a usable version to be a few months away.
The achievement builds on a series of recent successes in AI mathematical reasoning, including DeepMind's earlier AlphaGeometry and other high-performing systems. Challenges remain, however: despite surpassing human benchmarks in some competitions, many models still struggle to articulate fully rigorous proofs or sustain multistep logical chains, particularly on open-ended problems.
Nonetheless, the success at the IMO represents a major step forward in AI capabilities. With the potential to extend into domains such as physics, computer science, and engineering, these models may soon become powerful tools for scientific discovery.
While caution remains about how these models will be deployed, the achievement signals a promising new era where AI doesn’t just answer questions—it reasons, solves, and thinks at levels once thought to be uniquely human.