Opus 4.5 changed everything (Interview)
Episode: 104 min
Read time: 4 min
AI-Generated Summary
Key Takeaways
- ✓ Agentic Model Quality Threshold: Sonnet 3.5 could build agentically but produced sloppy spaghetti code that stalled on errors neither the developer nor the model could debug, because the developer had not written the code themselves. Opus 4.5 crossed a threshold: it one-shots functional native Windows tools using the correct WinUI libraries, produces well-structured, readable code, and completes full working apps in an afternoon. The practical test: Burke built a screen-capture-to-GIF tool, then extended it into a full screen recording editor in a few hours.
- ✓ Personal Software Economics (The SaaS Killer Pattern): When a model reaches sufficient capability, replacing paid SaaS subscriptions with custom-built personal software becomes viable in a single afternoon. Burke replaced a paid routing app his wife used for her yard-sign business. Adam replaced a $500–$800/year invoicing service by prompting Claude Opus 4.6 with extended thinking to generate an optimal prompt, then handing that prompt to Augment Code's Auggie CLI overnight. He woke to a working Rails app with invoicing, PDF generation, and email built in.
- ✓ Subsidized Token Economics Won't Last: GitHub Copilot at $40/month offers request-based billing where one agent run doing 6,000 operations counts as a single request, with 1,500 requests per month included. Claude's $200/month Max plan supports roughly one billion tokens monthly at an estimated provider cost of $25,000, a subsidy of about $24,800 per user per month. Burke explicitly states that this pricing cannot persist indefinitely and that developers should maximize usage now, treating the current window as a finite opportunity before costs normalize.
- ✓ Plan Mode as Context Extraction, Not Documentation: The value of agent plan mode is not the written plan itself. It is that planning forces the model to surface the requirements and constraints the developer forgot to specify in the initial prompt. Running four to six planning loops with Opus 4.6 before execution dramatically improves output quality. Burke's current workflow: plan mode in Copilot CLI → autopilot with a custom agent called Anvil → confidence-threshold loop (targeting 95% confidence rather than "done") → verifiable output check via browser skill or unit tests.
- ✓ Multi-Model Orchestration as Standard Workflow: Within GitHub Copilot, developers can route different subtasks to different models in a single run. Burke's Anvil agent classifies tasks as easy, medium, or hard, then delegates design work to Gemini, code refactoring to GPT-5.3 Codex, and planning/communication to Claude Opus 4.6, potentially spawning 26 parallel sub-agents on large refactors. The practical framing: use Opus 4.6 as the communicative team lead and GPT-5.3 as the senior engineer who writes the actual code without needing to be pleasant about it.
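The routing and confidence-threshold ideas in the last two takeaways can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Anvil's actual implementation: the model labels in `ROUTES`, the `Attempt` type, and the `pick_model`, `run_until_confident`, and `fake_step` functions are all hypothetical names, and the scripted confidence values stand in for whatever a real agent would report.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical labels based on the summary; these are not real API identifiers.
ROUTES = {
    "design": "gemini",
    "refactor": "gpt-5.3-codex",
    "planning": "claude-opus-4.6",
}


@dataclass
class Attempt:
    output: str
    confidence: float  # self-reported by the agent, 0.0 to 1.0


def pick_model(task_kind: str) -> str:
    """Return the model label for a task kind, defaulting to the planner."""
    return ROUTES.get(task_kind, ROUTES["planning"])


def run_until_confident(
    step: Callable[[int], Attempt],
    threshold: float = 0.95,
    max_rounds: int = 6,
) -> Tuple[Attempt, int]:
    """Repeat `step` until confidence reaches the threshold or rounds run out."""
    rounds = 1
    attempt = step(rounds)
    while attempt.confidence < threshold and rounds < max_rounds:
        rounds += 1
        attempt = step(rounds)
    return attempt, rounds


# Stand-in for a real agent call: confidence follows a fixed script.
SCRIPTED_CONFIDENCE = [0.60, 0.80, 0.96]


def fake_step(round_number: int) -> Attempt:
    index = min(round_number, len(SCRIPTED_CONFIDENCE)) - 1
    return Attempt(f"draft {round_number}", SCRIPTED_CONFIDENCE[index])


if __name__ == "__main__":
    print(pick_model("refactor"))
    final, used = run_until_confident(fake_step)
    print(final.output, final.confidence, used)
```

The loop stops on a confidence target rather than on the first "done" signal, and `max_rounds` caps the cost if the target is never reached. In a real workflow the final attempt would still go through a verifiable check such as unit tests.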
What It Covers
Adam Stacoviak of The Changelog interviews Burke Holland from GitHub Copilot about how Claude Opus 4.5, released in November 2025, created a step-function improvement in agentic coding capability. The conversation covers practical AI-assisted development workflows, the economics of subsidized model access, the future of software craftsmanship, and whether developers will be replaced or transformed into polymaths.
Key Questions Answered
- Conceptual Knowledge Accelerates Faster Than Syntax Knowledge: Developers using AI are learning architectural concepts (ETL pipelines, Medallion architecture with its bronze/silver/gold layers, UNIX sockets in Go, gRPC versus REST) at a rate impossible through traditional syntax-focused learning. The mechanism is iterative brainstorming: a developer brings a conceptual direction, the model explains the implementation landscape, the developer iterates, and the concept becomes table stakes within days. This expands rather than contracts developer knowledge, shifting the valuable skill from writing functions to directing architectural decisions.
- Production Shipping Remains the Unsolved Gap: Vibe-coded projects proliferate but rarely reach production because deployment, security, architecture decisions, SLA management, and error handling still require substantial developer expertise. Burke's framing: code has never been the hard part. Getting software into production always has been, and AI has not changed that. Teams like VS Code's engineering group are actively building AI-assisted workflows specifically for production-quality software, where the editor cannot break for millions of users, and they do not yet have a complete answer.
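As an illustration of one concept named above, the Medallion architecture, here is a minimal Python sketch of the bronze/silver/gold layering. It was not discussed in this form in the episode: the comma-separated record format, the sample data, and the function names `to_bronze`, `to_silver`, and `to_gold` are assumptions chosen for the example.

```python
from collections import defaultdict


def to_bronze(raw_lines):
    """Bronze: keep every raw record exactly as received."""
    return [{"raw": line} for line in raw_lines]


def to_silver(bronze_rows):
    """Silver: parse and validate, dropping rows that are malformed."""
    silver = []
    for row in bronze_rows:
        parts = row["raw"].split(",")
        if len(parts) != 2:
            continue  # wrong number of fields
        customer, amount = parts[0].strip(), parts[1].strip()
        try:
            silver.append({"customer": customer, "amount": float(amount)})
        except ValueError:
            continue  # amount is not numeric
    return silver


def to_gold(silver_rows):
    """Gold: aggregate cleaned rows into a business-level view."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


# Hypothetical sample input: "customer, amount" lines, some of them invalid.
raw = ["acme, 100.0", "acme, 50.5", "bad row", "globex, abc", "globex, 20"]

bronze = to_bronze(raw)
silver = to_silver(bronze)
gold = to_gold(silver)

if __name__ == "__main__":
    print(len(bronze), len(silver), gold)
```

Each layer only reads from the one before it, so the raw bronze data stays available for reprocessing if the cleaning rules in the silver step change.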
Notable Moment
Burke describes letting a GitHub Copilot CLI agent run continuously in a loop for days, autonomously deciding which features to add to a multiplayer game where users roleplay as baby birds. He acknowledges it will likely produce an unwieldy, unshippable result, and frames this as an honest illustration of where autonomous agentic software development currently breaks down at scale.