Google DeepMind announced AlphaProof Nexus on May 26, a reasoning model that has autonomously solved 9 of 353 open Erdős problems, including questions that had been unanswered for decades. The Decoder reported the results based on DeepMind’s release, which puts the inference cost per problem in the low hundreds of dollars.
The numbers warrant precision. A 2.5% solve rate on the Erdős open-problem set is a modest absolute figure, but the Erdős collection is exceptional: it represents specific questions that the most prolific mathematician of the 20th century identified as both important and tractable enough to attempt, and that the field has nonetheless not closed in the decades since. The nine solved problems include questions related to combinatorics, number theory, and graph theory, with several having been open for more than 40 years.
The cost-per-solve figure is the operationally interesting number. At a few hundred dollars per problem in inference, the economics of throwing AlphaProof Nexus at the remaining 344 open problems are within the budget of any university math department or small foundation. The previous generation, AlphaProof, reached silver-medal-level performance on International Mathematical Olympiad problems in 2024. Nexus extends that capability to research-level problems while compressing the per-problem cost to something a graduate student’s discretionary budget could fund.
The comparison to OpenAI’s geometry conjecture disproof we covered earlier in May is direct. OpenAI’s result resolved one named problem (the planar unit-distance bound) using algebraic number theory techniques. DeepMind’s result clears nine problems across multiple subfields. Together, the two announcements indicate that frontier reasoning models are now contributing to original mathematics faster than the academic literature can absorb. The peer-review system was designed for a human submission cadence, and it is being asked to handle output at AI scale.
Structural skepticism applies. DeepMind is publishing this result, framing it favorably, and selecting the problems to highlight. The nine solved problems were chosen for the announcement; the other 344 attempts presumably did not reach a verified proof. The cost-per-solve figure is the cost of the successful runs, not the total compute spent across all attempted problems, failures included. The true cost per solution divides that larger total by the same nine solves, so it is presumably higher than the headline figure.
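The gap between a headline cost per solve and a failure-inclusive cost per solution can be sketched with simple arithmetic. The snippet below is illustrative only: the 9 and 353 come from the reported results, but the $300 per-run cost is a placeholder standing in for "low hundreds of dollars", and the assumption that a failed attempt costs as much as a successful one is ours, not DeepMind's.

```python
def solve_rate(solved: int, attempted: int) -> float:
    """Fraction of attempted problems that ended in a solution."""
    return solved / attempted


def cost_per_solution(solved: int, attempted: int,
                      cost_per_success: float, cost_per_failure: float) -> float:
    """Total spend across all attempts, divided by the number of solutions."""
    failures = attempted - solved
    total_spend = solved * cost_per_success + failures * cost_per_failure
    return total_spend / solved


SOLVED = 9            # reported
ATTEMPTED = 353       # reported size of the open-problem set
ASSUMED_COST = 300.0  # hypothetical stand-in for "low hundreds of dollars"

rate = solve_rate(SOLVED, ATTEMPTED)
# Headline view: only the successful runs are counted.
headline = cost_per_solution(SOLVED, ATTEMPTED, ASSUMED_COST, 0.0)
# Loaded view: assume each failed attempt cost the same as a success.
loaded = cost_per_solution(SOLVED, ATTEMPTED, ASSUMED_COST, ASSUMED_COST)

print(f"solve rate: {rate:.2%}")
print(f"headline cost per solution: ${headline:,.2f}")
print(f"failure-inclusive cost per solution: ${loaded:,.2f}")
```

Under these placeholder numbers the failure-inclusive figure is roughly 39 times the headline one (353 / 9), which is the scale of correction the skeptical reading implies if failed runs are not cheaper than successful ones.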
External verification matters here. The proofs produced by AlphaProof Nexus are submitted to mathematicians for review, the same way OpenAI’s geometry result was reviewed by nine credentialed mathematicians. Whether the AlphaProof Nexus proofs survive expert scrutiny at the same rate is a question that will resolve over the next several months as the mathematical community engages with the specific results.
For research labs and university math departments evaluating whether to incorporate AI reasoning models into their open-problem workflows, the cost curve has just moved meaningfully. Running a frontier reasoning model against a candidate problem is now within the budget of any active research group. The expected output is not a guaranteed proof, but the chance of a partial advance or a useful sub-lemma is high enough that skipping the run is defensible only if the researcher time it would consume is worth more than the expected payoff. For most academic settings, the balance has tipped toward running the model.
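The run-or-skip reasoning above amounts to an expected-value comparison, which can be sketched as follows. Every number here is hypothetical and chosen only to show the shape of the calculation; the function names and parameters are illustrative and do not come from DeepMind or The Decoder.

```python
def expected_net_value(p_useful: float, value_if_useful: float,
                       inference_cost: float, review_hours: float,
                       hourly_rate: float) -> float:
    """Expected payoff of one run minus its cost in money and researcher time."""
    total_cost = inference_cost + review_hours * hourly_rate
    return p_useful * value_if_useful - total_cost


def break_even_probability(value_if_useful: float, inference_cost: float,
                           review_hours: float, hourly_rate: float) -> float:
    """Smallest chance of a useful result at which the run pays for itself."""
    return (inference_cost + review_hours * hourly_rate) / value_if_useful


# Hypothetical inputs: a 10% chance of a useful sub-lemma valued at $10,000,
# a $300 inference bill, and 5 hours of checking at $50 per hour.
net = expected_net_value(0.10, 10_000.0, 300.0, 5.0, 50.0)
threshold = break_even_probability(10_000.0, 300.0, 5.0, 50.0)

print(f"expected net value: ${net:,.2f}")
print(f"break-even probability: {threshold:.1%}")
```

The hard inputs are the probability of a useful result and the value assigned to it, neither of which the announcement quantifies; the sketch only shows that with a small inference bill, researcher time tends to dominate the cost side.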
Reported by The Decoder on 2026-05-26.