“Math Is Cooked.” A Physicist Said It Twice. Mathematicians Are Still Arguing About It.
An OpenAI model just disproved an 80-year-old conjecture. Then a Princeton mathematician beat it by a factor of a billion in a single weekend. Here’s what actually happened and what it means for everyone.
A few weeks ago, MIT physicist Alexander Wissner-Gross made a claim blunt enough to say twice in the same breath: math, in his words, is “cooked.” He wasn’t alone in thinking it. But the reaction that followed split the mathematics world into two camps so sharply that sixteen mathematicians from fifteen universities eventually felt compelled to write a formal declaration about what AI is allowed to do to their discipline.
So, is mathematics actually finished? After looking closely at what happened over the past six weeks, the honest answer is more interesting than either side of that argument wants it to be.
The Result That Started Everything
To understand the intensity of this debate, you first have to understand how quickly the underlying capability developed. In 2022, large language models struggled with basic addition. By 2024, they were solving Olympiad-level problems, the mathematical equivalent of an elite athletic competition. By 2025, they were attempting genuine research questions, with results that were, charitably, inconsistent. Then, on May 20, 2026, OpenAI announced that an internal reasoning model had disproved a conjecture that had stood unchallenged for nearly 80 years.
Four years to go from arithmetic errors to solving what generations of mathematicians could not. Nobody anticipated that trajectory.
The problem itself, the unit distance problem, is unusually easy to state for a research-level question. Place a fixed number of points on a flat plane and ask: at most how many pairs of them can sit exactly one unit of distance apart? Paul Erdős posed the question in 1946 and believed that the best possible arrangement resembled a simple square grid. For eighty years, nearly every mathematician who studied the problem agreed with him.
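To make the question concrete, here is a minimal Python sketch that counts pairs in a small square grid. It rests on one assumption about how to read the grid construction: take an integer grid, find the distance that occurs most often between its points, and rescale the whole grid so that distance becomes one unit. The function names are illustrative, and the sketch says nothing about the configurations the OpenAI model found.

```python
from collections import Counter
from itertools import combinations


def squared_distance_counts(side):
    """Count point pairs in a side x side integer grid, keyed by squared distance.

    Squared distances are integers, so there is no floating-point rounding.
    """
    points = [(x, y) for x in range(side) for y in range(side)]
    counts = Counter()
    for (x1, y1), (x2, y2) in combinations(points, 2):
        counts[(x1 - x2) ** 2 + (y1 - y2) ** 2] += 1
    return counts


def best_rescaled_grid(side):
    """Return (squared_distance, pair_count) for the most frequent distance.

    Rescaling the grid by 1 / sqrt(squared_distance) turns those pairs
    into unit-distance pairs.
    """
    counts = squared_distance_counts(side)
    return max(counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    side = 10  # 100 points in total
    counts = squared_distance_counts(side)
    # Neighbouring points only: 9 * 10 horizontal + 9 * 10 vertical pairs.
    print("pairs at grid spacing 1:", counts[1])
    print("most frequent squared distance, pair count:", best_rescaled_grid(side))
```

Running it shows that the most frequent distance in the grid is not the spacing between neighbouring points, which is why the choice of scale matters when a grid is used as a candidate arrangement.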
OpenAI’s model disproved that assumption entirely. It constructed an infinite family of configurations that beat the square grid, using tools borrowed from algebraic number theory, a field nobody had previously connected to this specific geometric question. Cambridge mathematician and Fields Medalist Timothy Gowers called it a milestone in AI-driven mathematics in a commentary solicited by OpenAI.
The detail that fascinated researchers most: the model found the solution on its first attempt. When OpenAI reran the same problem multiple times afterward, it landed on the correct answer roughly half the time. That tells you something specific. The model cannot reliably verify its own logic. But it also isn’t producing noise that vaguely resembles mathematics. It occupies a strange middle territory that doesn’t yet have a name.
Why Mathematicians Call the Proof Ugly
What makes this result genuinely uncomfortable for the field isn’t the result itself. It’s how the model got there.
The construction, according to mathematicians who reviewed it, is inelegant. It pulls in machinery from a completely unrelated branch of mathematics and applies it in a way that feels disconnected from the original geometric question. A human mathematician would likely never have walked down that path, not because they lacked the technical ability, but because they lacked the taste for it.
This sounds like a minor stylistic quibble. It isn’t. In 1940, British mathematician G.H. Hardy wrote a famous short book in which he argued that elegance functions as a criterion of mathematical truth, on par with logical rigor. For Hardy, a proof that merely demonstrates a result without revealing why that result is true misses the entire point of the exercise. Generations of mathematicians since have worked by that compass.
AI has no concept of elegance. It isn’t trying to understand why something is true. It is trying to show that it is true, and that distinction is precisely what makes it formidable. The model explores paths that human researchers avoid, not from incapacity but from good taste, and sometimes those unfashionable paths lead somewhere real.
The Panic and the Counterattack That Followed
The reaction inside academic mathematics matched the scale of the shock. Scott Aaronson, a computer scientist at the University of Texas, described doctoral students arriving at his office visibly shaken, worried that the path they had chosen for their careers might already be closing. The academic world reacts to automation differently from industry. In industry, people fear AI will take their jobs. In academia, researchers fear it will take their jobs before they ever get the chance to start one.
The eulogy turned out to be premature. Almost immediately, Princeton mathematician Will Sawin took the same underlying method the AI had discovered and improved it by hand. The AI had proven its approach worked, but had not measured how much better it was than the old grid construction. Sawin took the same ideas, cleaned up the calculations, and, over a single weekend, arrived at a result that was, in explicit terms, billions of times stronger.
That sequence is worth sitting with. The machine opened a door nobody had noticed was unlocked. A human ran through it.
Sixteen Mathematicians Draw a Line
The second fracture in the “math is finished” narrative came from the community itself, and it carried far more institutional weight.
On June 2, 2026, sixteen mathematicians from fifteen universities, including Oxford, Cambridge, Columbia, and ETH Zurich, published the Leiden Declaration on Artificial Intelligence and Mathematics, an eleven-page founding document endorsed by the International Mathematical Union. The declaration states plainly that AI threatens the integrity of mathematical proof, the system by which credit for results is attributed, and the autonomy of mathematical research itself. It includes a line that has been quoted increasingly often since: the technology industry has a strong commercial incentive to overstate what these systems can actually do.
That observation, notably, comes from inside the discipline rather than from outside critics, and it helps explain why mainstream media outlets suddenly developed an interest in a combinatorial geometry problem nobody outside the field had previously heard of.
The declaration’s growth has been unusually fast for an academic document. It launched with 37 signatories. Within days, it passed 1,500. As of this writing, it approaches 2,000, including Fields Medalists Peter Scholze and Terence Tao, the latter widely regarded as the most respected living mathematician. This is not an informal petition. It is a structured disciplinary movement that will be formally presented in late July at the International Congress of Mathematicians in Philadelphia, the same quadrennial gathering where the Fields Medal, mathematics’ equivalent of a Nobel Prize, is awarded.
Real Researchers Built Their Own Benchmark
While mathematicians organized, the underlying evidence kept accumulating. On June 10, 2026, the First Proof project published its initial results. First Proof is an anti-corporate benchmark, designed not by AI labs but by working research mathematicians from Harvard, Princeton, and Columbia, and built specifically to test what AI can actually do on problems drawn from their own active research, rather than problems chosen to flatter a particular model.
The top AI model produced answers judged mostly correct on roughly 60–70% of the tasks. Genuinely impressive. But the graders, who spent two full days at Harvard evaluating submissions, flagged something equally important: the models also produced enormous volumes of unusable output, brilliant insight buried inside noise. Separating signal from noise remains, for now, a job that requires a human.
Is a Proof Still a Proof If No One Understands It?
This question cuts deeper than benchmark scores. In 1976, the four-color theorem became the first major result proven with computer assistance. Many mathematicians at the time refused to accept it as a true proof, because they could not verify or understand it themselves.
Fifty years later, the same argument has resurfaced, except the stakes are larger. OpenAI’s proof runs to roughly 125 pages and crosses multiple subfields that very few individual specialists fully command at once. A paper published in Nature on June 8, 2026, titled “How AI is reshaping discovery in mathematics and physics,” reaches a more measured conclusion than either side of the public argument: AI does not replace human intuition in these fields. It reconfigures how questions get asked, explored, and understood.
An extraordinarily powerful research tool, not a replacement for the researcher.
What appears to be happening, concretely, is that decades of published mathematical literature contain enough latent connections, enough proofs sitting just out of reach, for a system capable of digesting massive volumes of information and testing thousands of approaches quickly to start finding low-hanging fruit. The same dynamic is unfolding in physics and chemistry. The Erdős counterexample connected two branches of mathematics nobody had previously linked. That isn’t creativity in the deepest sense. It’s large-scale pattern combination. There will be more results like it, plenty more.
The actual unresolved question, the one nobody has settled with any confidence, is whether AI can eventually go further than this: beyond pattern extraction from existing human methods, beyond filling gaps in published literature, toward inventing genuinely new conceptual frameworks of its own. Current models, as built, don’t appear designed for that kind of leap. But the uncertainty here is thick enough that no serious expert is willing to rule it out entirely.
A Nature editorial published June 15, 2026, argued that mathematicians are right to establish these guardrails now, and that other scientific disciplines should follow their example without delay. The editorial drew a direct comparison to the Asilomar Conference on recombinant DNA, a 1975 gathering whose conclusions reshaped global scientific practice for decades afterward. When Nature reaches for that comparison, it isn’t a casual rhetorical flourish.
What the Calculator Already Taught Us
Mathematics is not finished. AI is going to transform how the discipline operates, just as the calculator transformed arithmetic and the computer transformed modeling.
The calculator comparison deserves more weight than it usually gets. When calculators entered classrooms in the 1970s and 80s, the public debate was, word for word, identical to today’s. Newspaper headlines warned that students would forget how to count, that an entire generation of intellectually dependent adults was being manufactured in real time. There were petitions. There were op-eds.
Forty years later, mental arithmetic is no longer taught as an end in itself. Some educators still value it, and reasonably so. But the actual shift was toward teaching problem framing, modeling, and structured reasoning. Competency didn’t decline. It relocated. The real question was never whether the calculator would replace mathematical thinking. It was what the calculator would displace, and whether you would move with it.
Most people at the time didn’t ask that question. They asked the wrong one, loudly, and missed the actual shift happening underneath them. The people who understand what tools like these displace, who know which questions to ask them, and who can evaluate output critically enough to separate signal from noise don’t lose value when a tool like this arrives. Their value multiplies.
This isn’t only about mathematicians. It applies to anyone whose work involves information, analysis, or structured reasoning of any kind, far beyond the walls of a math department.
Do you think AI will eventually invent new mathematical frameworks, or is it permanently limited to recombining what already exists in the literature? Curious where you land.



