Tao: Kepler as High-Temp LLM in AI Science Era

AI cheapens hypothesis generation like Kepler's random trials on Brahe's data, but verification, depth, and judging long-term value remain human bottlenecks requiring judgment beyond RL.

Kepler's Empirical Grind as Prototype for AI Hypothesis Machines

Terence Tao recounts Johannes Kepler's path to the laws of planetary motion as a blend of wild theorizing and relentless data fitting, drawing direct parallels to high-temperature LLMs. Building on Copernicus's heliocentric circles and ancient observations, Kepler hypothesized the five Platonic solids nesting between the spheres of the six known planets: octahedron between Mercury and Venus, icosahedron between Venus and Earth, dodecahedron between Earth and Mars, tetrahedron between Mars and Jupiter, and cube between Jupiter and Saturn. This "Mysterium Cosmographicum" (1596) model seemed divinely elegant but crumbled against Tycho Brahe's unprecedented naked-eye dataset, roughly ten times more precise than prior records.

Kepler, after years of access struggles (including reportedly absconding with Brahe's data after his death), tried fudged circles and other geometries but eventually derived ellipses and equal areas swept in equal times (his first two laws), and, after another decade, the period-distance power law (third law) from a fit to just six data points. Tao notes Kepler's luck: with so few points he stayed appropriately tentative, unlike Johann Bode's later geometric-progression fit, which predicted a missing planet (seemingly fulfilled by Ceres) but failed on Neptune, a fluke exposed by more data.
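Kepler's third-law fit can be sketched as a log-log regression over the six planets he knew. This is an illustrative reconstruction using modern values in AU and years, not Brahe's actual figures:

```python
import math

# Modern semi-major axes (AU) and orbital periods (years) for the
# six planets known to Kepler; Brahe's values differed slightly.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.457),
}

# Fit log T = m * log a + b by ordinary least squares.
xs = [math.log(a) for a, _ in planets.values()]
ys = [math.log(t) for _, t in planets.values()]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
    sum((x - mx) ** 2 for x in xs)
b = my - m * mx

print(f"exponent ~ {m:.3f}")  # Kepler's third law predicts 1.5
```

The fitted exponent lands almost exactly on 3/2, i.e. T^2 proportional to a^3; with only six points, though, many other power laws would have looked nearly as good, which is Tao's point about tentativeness.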

Dwarkesh Patel frames Kepler as a "high-temperature LLM," sampling absurd relations (planetary harmonies in which Earth sings a mi-fa-mi "misery and famine" note) until a hypothesis verifiably fits Brahe's "massive dataset." Tao agrees that cycling through hypotheses is key but stresses that Brahe's precision is what enabled it; without verification, it's slop. This mirrors modern data-first science: big datasets precede patterns, inverting the hypothesize-then-test order.
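The "high-temperature" metaphor refers to the sampling temperature applied to an LLM's output distribution. A minimal sketch of temperature-scaled softmax sampling (the function names and example logits here are illustrative, not from the transcript):

```python
import math
import random

def temperature_probs(logits, temperature):
    """Softmax over logits after dividing by temperature."""
    scaled = [l / temperature for l in logits]
    hi = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - hi) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample(logits, temperature, rng=random):
    """Draw one index from the tempered distribution."""
    probs = temperature_probs(logits, temperature)
    r, cum = rng.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

logits = [2.0, 1.0, 0.0]
print(temperature_probs(logits, 0.1))   # near-deterministic argmax
print(temperature_probs(logits, 100.0)) # near-uniform: wild, Kepler-like guesses
```

Low temperature concentrates probability on the most likely token; high temperature flattens the distribution toward uniform, which is the sense in which a "high-temperature" Kepler samples wild hypotheses.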

"Kepler was maybe one of the first early data scientists, but even he didn’t start with Tycho’s dataset and then analyze it. He had some preconceived theories first," says Tao, highlighting the grind of discarding failures.

AI Floods Science with Ideas, Starving Verification Pipelines

Tao argues AI has driven the cost of idea generation to near zero, much as the internet did for communication, flooding science with thousands of candidate theories per problem. Pre-AI, peer review filtered out amateur slop; now journals drown in AI-generated papers that overwhelm human reviewers.

The new bottleneck: scalable verification, validation, and assessment of forward progress. Humans can debate a single paper to consensus over years; AI scale demands new structures. Patel probes how to detect "unifying concepts" (like the information-theoretic bit, from Bell Labs) amid billions of AI outputs. Tao invokes the "test of time": breakthroughs like deep learning or transformers languished as niche ideas before exploding, their value dependent on culture, adoption, and future context, not on any isolatable metric.

Decimal versus Roman numerals, or binary versus ternary digits, illustrates path dependence; no objective RL score predicts fruitfulness. Even correct theories often falter at first: Copernicus's circular orbits underperformed Ptolemy's tweaked epicycles; Aristarchus's heliocentrism implied implausibly distant stars (to explain the absence of observed parallax); Newton's gravity shocked contemporaries with action at a distance and the equivalence of inertial and gravitational mass, puzzles resolved only later by Einstein.

"I think AI has driven the cost of idea generation down to almost zero... Now the bottleneck is different. We’re now in a situation where suddenly people can generate thousands of theories," Tao warns.

Depth Over Breadth: AI Enriches but Skirts True Understanding

Tao observes that AI helps papers by broadening scope and enriching arguments but rarely deepens their core insights. Selection bias amplifies the hype: reported AI discoveries cherry-pick hits and ignore failures. A "deductive overhang" looms: vast unexplored consequences of known mathematics await deduction, much as Newton later derived Kepler's empirical laws from his theory of gravitation.

Can humans extract understanding from AI solutions? Tao says sometimes, via Lean formalizations or exploratory paths AI reveals, but often proofs stay opaque. He advocates a "semi-formal language" capturing scientists' imprecise discourse—sketches, analogies, partial arguments—beyond rigid proofs, aiding AI-human loops.
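Formalization in Lean, which Tao mentions, turns a prose argument into a machine-checked proof, one end of the spectrum whose middle his proposed semi-formal language would fill. A toy Lean 4 example (illustrative only, using the core-library lemma `Nat.add_comm`):

```lean
-- A fully formal, machine-checkable statement: addition on the
-- natural numbers is commutative. The checker accepts nothing
-- less precise than this; Tao's "semi-formal language" would sit
-- between such proofs and ordinary mathematical prose.
theorem my_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Every step here is verified by the kernel, which is exactly why such formalizations can make an opaque AI-found proof trustworthy even before humans understand it.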

Patel questions whether AI solving problems by brute search still helps: humans might reverse-engineer why the solution works. Tao counters that intuition is the missing ingredient; AI lacks the judgment to pick fruitful paths amid the combinatorial explosion.

"AI makes papers richer and broader, but not deeper," Tao summarizes.

Human-AI Hybrids Rule Math's Future, Tao's Workflow

Tao predicts human-AI hybrids will dominate math for longer than pure AI, since humans supply judgment, intuition, and unification. He spends roughly 20% of his time on email and collaboration, 30% reading and proving, 30% writing, and 20% on AI experiments, using tools for literature scans, formalization (Lean), and exploration, while steering with his own heuristics.

No full AI takeover of math is imminent; verification loops span decades and rely on judgment heuristics that resist codification into RL rewards. Progress survives "epistemic hell" through smarts that are never fully articulated.

"Human-AI hybrids will dominate math for a lot longer," Tao asserts.

Key Takeaways

  • Prioritize verification infrastructure: build systems for testing AI-generated hypotheses en masse, beyond peer review.
  • Embrace test-of-time evaluation: Judge ideas by long-term adoption and extensions, not immediate metrics or citations.
  • Hunt deductive overhangs: Exhaust implications of existing theorems before new data; AI accelerates but humans unify.
  • Develop semi-formal languages: Capture scientists' natural discourse (sketches, analogies) for better AI integration.
  • Use AI for breadth, humans for depth: Leverage LLMs for exploration/literature, apply intuition to prune and deepen.
  • Collect precise data first: Like Brahe, invest in high-fidelity datasets to ground hypothesis swarms.
  • Beware selection bias: Track AI failures alongside wins to avoid hype.
  • In workflows, allocate ~20% to AI experiments: Scan, formalize, explore—but steer with human judgment.

Notable quotes:

  • "Traditionally, when we talk about the history of science, idea generation has always been the prestige part of science... But as you say, it has to be matched by an equal amount of verification, otherwise it’s slop." — Terence Tao
  • "The data was extremely important... You collect large datasets and then draw patterns from them to deduce thoughts. This is a little bit different from how science used to work." — Terence Tao
  • "Often, the ultimately correct theory initially is worse in many ways. Copernicus’s theory of the planets was less accurate than Ptolemy’s theory." — Terence Tao
  • "We need a semi-formal language for the way that scientists actually talk to each other." — Terence Tao
  • "If AI solves a problem, can humans get understanding out of it?" — Dwarkesh Patel, prompting Tao's nuanced yes via paths and formalizations.

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge