🔬Doing Vibe Physics — Alex Lupsasca, OpenAI
Episode: 91 min
Read time: 3 min
Topics: Productivity, Startups, Fundraising & VC
AI-Generated Summary
Key Takeaways
- AI Research Inflection Points: Three distinct capability jumps mark AI's entry into frontier physics: o3 solved a calculation in 11 minutes that would have taken days; GPT-5 reproduced a published paper's hardest derivation in 30 minutes; and an internal OpenAI model spent 12 hours independently rediscovering and proving a formula that three expert physicists could not crack in over a year of sustained effort.
- Vibe Physics Workflow: The gluon amplitude paper was produced by feeding known formulas into GPT-5.2 Pro, asking it to simplify, then requesting a general-case conjecture. The model ran Python across 5,000 cases autonomously, reduced 32-term expressions to 4-term products, and proposed a linearly scaling formula in place of one with factorial growth. Researchers then used a separate internal model in a fresh session to independently verify and prove the conjecture without being given the answer.
- Graviton Paper in Days, Not Months: The graviton amplitude result, mathematically distinct from the gluon case, was produced in roughly three days using only publicly available GPT-5.2 Pro. Researchers provided the gluon paper as context, wrote two paragraphs of steering instructions, and the model applied the directed matrix tree theorem unprompted. The three-week publication delay was spent on human verification and writeup, not on derivation.
- Two Concrete Research Accelerators: AI compresses two specific bottlenecks in physics research. First, confusion time (the days spent reconciling contradictory results) drops sharply when a model can immediately identify overlooked assumptions. Second, researchers can now launch parallel "scout" sessions across 10 different approaches simultaneously, getting rapid signal on which directions are viable before committing to the full calculation, replacing the sequential trial-and-error that previously defined theoretical work.
- The Verification Bottleneck Replaces the Derivation Bottleneck: As models handle derivations, human effort shifts almost entirely to checking outputs. This creates a new constraint: models don't consistently signal confidence levels on individual steps, making it hard to know where to focus scrutiny. Lupsasca identifies two near-term model improvements needed: better calibration of expressed uncertainty on specific steps, and integration of formal verification tools like Lean to automate output checking at scale.
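The takeaways above mention two ideas that a small example can make concrete: checking a closed-form formula against direct computation over many cases, and the directed matrix tree theorem. The sketch below is a toy illustration only and is not the amplitude calculation from the papers. It assumes the standard statement of the theorem (the weighted sum over spanning trees directed away from a root equals a minor of the in-degree Laplacian), and all function names here are hypothetical.

```python
import random
from fractions import Fraction
from itertools import product


def determinant(matrix):
    """Exact determinant via Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def tree_sum_by_determinant(w, root):
    """Closed form: minor of the in-degree Laplacian; w[i][j] is the weight of edge i -> j."""
    n = len(w)
    lap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                lap[i][j] -= w[i][j]  # off-diagonal: minus the edge weight
                lap[j][j] += w[i][j]  # diagonal: total weight entering j
    keep = [k for k in range(n) if k != root]
    return determinant([[lap[i][j] for j in keep] for i in keep])


def reaches_root(v, parent, root):
    """Follow parent pointers from v; fail if a cycle appears before the root."""
    seen = set()
    while v != root:
        if v in seen:
            return False
        seen.add(v)
        v = parent[v]
    return True


def tree_sum_by_enumeration(w, root):
    """Brute force: try every parent assignment and keep those that form a tree."""
    n = len(w)
    others = [v for v in range(n) if v != root]
    total = 0
    for parents in product(range(n), repeat=len(others)):
        parent = dict(zip(others, parents))
        if all(reaches_root(v, parent, root) for v in others):
            term = 1
            for v, p in parent.items():
                term *= w[p][v]  # edge p -> v
            total += term
    return total


def check_random_cases(cases=200, n=4, seed=0):
    """Compare the closed form with brute force on random integer weights."""
    rng = random.Random(seed)
    for _ in range(cases):
        w = [[0 if i == j else rng.randint(0, 9) for j in range(n)] for i in range(n)]
        root = rng.randrange(n)
        if tree_sum_by_determinant(w, root) != tree_sum_by_enumeration(w, root):
            return False
    return True


if __name__ == "__main__":
    print(check_random_cases())
```

For the complete directed graph on four vertices with unit weights, both functions return 16, which matches Cayley's count of 4^(4-2) labeled trees. The brute-force side enumerates exponentially many candidates while the determinant is polynomial in the number of vertices, which is the general sense in which a compact closed form replaces a rapidly growing sum.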
What It Covers
Vanderbilt physicist and OpenAI fellow Alex Lupsasca describes how GPT models solved two open problems in theoretical physics — single-minus gluon and graviton tree amplitudes — that stumped expert researchers for over a year. The episode traces AI's progression from email assistant to quantum field theory collaborator, covering methodology, implications for scientific training, and the verification bottleneck now facing researchers.
Key Questions Answered
- Graduate Training Has No Clear Answer Yet: The traditional PhD model relies on professors giving students "safe" problems (questions with known solutions) to build technical confidence over six-month cycles. Models can now solve many of those training problems in under 30 minutes. Lupsasca states academia has no established replacement framework. The skill that transfers most directly to effective AI collaboration is the same skill developed by advising humans: knowing how to frame a question at the right level of specificity for a given collaborator.
- Raising the Bar Rather Than Increasing Volume: Models can now produce a publishable physics paper per day on incremental questions. Lupsasca argues the correct response is not to maximize output but to target harder problems: specifically, questions that have blocked entire research communities for decades rather than individual groups for months. The single-minus amplitude results open a line of attack on quantum gravity questions, and the goal is to use AI to reach problems that previously had no viable computational pathway.
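The summary names Lean as an example of a formal verification tool for checking model output. As a rough sense of what machine-checked output looks like, here is a minimal sketch assuming Lean 4 with no extra libraries. The definition `sumTo` and the statements about it are hypothetical toy examples and have nothing to do with the amplitude results.

```lean
-- Hypothetical toy definition: the sum 0 + 1 + ... + n.
def sumTo : Nat → Nat
  | 0 => 0
  | n + 1 => sumTo n + (n + 1)

-- A specific case, accepted only because the kernel can compute both sides.
example : sumTo 4 = 10 := rfl

-- A statement about every n; it holds by unfolding the definition.
theorem sumTo_succ (n : Nat) : sumTo (n + 1) = sumTo n + (n + 1) := rfl
```

If either statement were false, Lean would reject the file rather than report low confidence, which is the property that makes this kind of tool attractive for checking long derivations.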
Notable Moment
When Lupsasca gave GPT-5 Pro a black hole symmetry problem he had personally solved and published — with a training cutoff predating that paper — the model initially failed. After being given the simpler flat-space warm-up version as a primer, it then solved the full black hole problem in 18 minutes, reproducing one of Lupsasca's most technically demanding results without access to his paper.