Over the weekend, Neel Somani, a software engineer, former quant researcher, and startup founder, was testing the math skills of OpenAI’s new model when he made an unexpected discovery. After pasting the problem into ChatGPT and letting it think for 15 minutes, he came back to a full solution. He evaluated the proof and formalized it with a tool from Harmonic, and it all checked out.
“I was curious to establish a baseline for when LLMs are effectively able to solve open math problems compared to where they struggle,” Somani said. The surprise was that, using the latest model, the frontier started to push forward a bit.
ChatGPT’s chain of thought is even more impressive, rattling off mathematical results like Legendre’s formula, Bertrand’s postulate, and the Star of David theorem. Eventually, the model found a Math Overflow post from 2013, where Harvard mathematician Noam Elkies had given an elegant solution to a similar problem. But ChatGPT’s final proof differed from Elkies’ work in important ways, and gave a more complete solution to a version of the problem posed by legendary mathematician Paul Erdős, whose vast collection of unsolved problems has become a proving ground for AI.
For anyone skeptical of machine intelligence, it’s a surprising result, and it’s not the only one. AI tools have become ubiquitous in mathematics, from formalization-oriented LLMs like Harmonic’s Aristotle to literature review tools like OpenAI’s deep research. But since the release of GPT 5.2, which Somani describes as “anecdotally more skilled at mathematical reasoning than earlier iterations,” the sheer volume of solved problems has become difficult to ignore, raising new questions about large language models’ ability to push the frontiers of human knowledge.
Somani was looking at the Erdős problems, a set of over 1,000 conjectures by the Hungarian mathematician that are maintained online. The problems have become a tempting target for AI-driven mathematics, varying significantly in both subject matter and difficulty. The first batch of autonomous solutions came in November from a Gemini-powered model called AlphaEvolve, but more recently, Somani and others have found GPT 5.2 to be remarkably adept with high-level math.
Since Christmas, 15 problems have been moved from “open” to “solved” on the Erdős website, and 11 of the solutions have specifically credited AI models as involved in the process.
The revered mathematician Terence Tao has a more nuanced look at the progress on his GitHub page, counting eight different problems where AI models made meaningful autonomous progress on an Erdős problem, with six other cases where progress was made by locating and building on previous research. It’s a long way from AI systems being able to do math without human intervention, but it’s clear that there’s an important role for large models to play.
On Mastodon, Tao conjectured that the scalable nature of AI systems makes them “better suited to being systematically applied to the ‘long tail’ of obscure Erdős problems, many of which actually have straightforward solutions.”
“As such, many of these easier Erdős problems are now more likely to be solved by purely AI-based methods than by human or hybrid means,” Tao continued.
Another driving force is a recent shift toward formalization, a labor-intensive task that makes mathematical reasoning easier to verify and extend. Formalization doesn’t require the use of AI or even computers, but a new crop of automated tools has made the process far easier. The open-source “proof assistant” Lean, which was developed at Microsoft Research in 2013, has become widely used within the field as a way of formalizing proofs, and AI tools like Harmonic’s Aristotle promise to automate much of the work of formalization.
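To give a rough sense of what a formalized statement looks like, here is a minimal Lean 4 sketch. It is a toy illustration only, not taken from Somani’s proof or from any Erdős problem, and the theorem name `add_comm_toy` is made up for this example. It assumes a plain Lean 4 installation with no extra libraries.

```lean
-- A toy statement with a machine-checked proof in Lean 4.
-- `add_comm_toy` is a hypothetical name chosen for this illustration.
-- The proof simply cites the core library lemma `Nat.add_comm`.
theorem add_comm_toy (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact; `rfl` asks Lean to confirm it by computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong, Lean would refuse to accept the file, which is what makes formalized reasoning easier to verify than a proof written in prose.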
For Harmonic founder Tudor Achim, the sudden jump in solved Erdős problems is less important than the fact that the world’s greatest mathematicians are starting to take these tools seriously. “I care more about the fact that math and computer science professors are using [AI tools],” Achim said. “These people have reputations to protect, so when they’re saying they use Aristotle or they use ChatGPT, that’s real evidence.”

