The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
Episode: 68 min · Read time: 3 min · Topics: Remote Work
AI-Generated Summary
Key Takeaways
- Agent Architecture (Out-of-Box vs In-Box): Running the agent harness outside the sandbox is more complex but more secure. When the agent runs inside the sandbox, its secrets must live there too, which creates exfiltration risk. The out-of-box approach keeps the "brain" in a control plane and the "hands" in the sandbox, which allows credentials scoped per machine and cleaner permission boundaries in multi-user environments.
- VM Infrastructure Over Docker: Full virtual machines suit coding agents better than Docker containers for two reasons: Docker is not a true security boundary, and real applications often use Docker internally, which leads to nested Docker-in-Docker conflicts. Cognition built a custom block-diff file storage format so that a VM writes only data proportional to the file system diff, dramatically reducing boot and restore times for agent sessions.
- Repo Setup as the Persistent Bottleneck: For an agent to run, test, and interact with a codebase autonomously, it needs a working local developer environment, including Docker Compose, local databases, and scoped credentials. Most companies lack this infrastructure, especially older ones built before containerization. Teams should set up the local dev environment before deploying background agents, because an agent cannot ask "Bob" for secrets.
- Memory Generation and Retrieval Remain Unsolved: Cognition's production memory system generates memories automatically when users correct Devin; about 95% of stored memories are created this way rather than written by hand. The challenge has two parts: generation must avoid turning one-off preferences into permanent rules, and retrieval must surface relevant memories without flooding the context. An emerging alternative is to let agents edit memory files directly, treating memory as a navigable file system.
- AI Code Slop Patterns Require Lint Guards: AI-generated code shows consistent anti-patterns: `getattr` used defensively even when attributes are known, untyped `dict[str, Any]` returns, backwards-compatibility shims that add unnecessary import-export layers, and excessive inline documentation. Teams should encode these as Semgrep or lint rules that fail pull requests automatically, so the patterns do not become entrenched in the codebase as reference examples for future generations.
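The lint-guard takeaway can be illustrated with a small checker. This is a minimal sketch, not the tooling discussed in the episode: it uses Python's standard `ast` module in place of Semgrep, and the names `find_slop`, `UNTYPED_DICT_RETURNS`, and `SAMPLE` are hypothetical. It flags only two of the patterns listed above: `getattr` called with a literal attribute name, and functions annotated as returning `dict[str, Any]`.

```python
import ast

# Return annotations this sketch treats as "untyped dict" (an assumption;
# a real rule set would likely cover more spellings).
UNTYPED_DICT_RETURNS = {"dict[str,Any]", "Dict[str,Any]"}


def find_slop(source: str) -> list[tuple[int, str]]:
    """Return sorted (line, message) pairs for two AI-slop patterns."""
    findings: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        # Pattern 1: getattr(obj, "name", ...) where the name is a string literal,
        # so plain attribute access would normally do.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "getattr"
            and len(node.args) >= 2
            and isinstance(node.args[1], ast.Constant)
            and isinstance(node.args[1].value, str)
        ):
            findings.append((node.lineno, "getattr with a literal attribute name"))
        # Pattern 2: a function whose return annotation is dict[str, Any].
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.returns is not None:
            annotation = ast.unparse(node.returns).replace(" ", "")
            if annotation in UNTYPED_DICT_RETURNS:
                findings.append((node.lineno, "function returns dict[str, Any]"))
    return sorted(findings)


# Hypothetical input exhibiting both patterns.
SAMPLE = '''
def load(user) -> dict[str, Any]:
    name = getattr(user, "name", None)
    return {"name": name}
'''

for line, message in find_slop(SAMPLE):
    print(f"line {line}: {message}")
```

To fail pull requests, a CI wrapper could run a check like this over changed files and exit non-zero whenever the findings list is non-empty.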
What It Covers
Walden Yan from Cognition and Cole Murray from OpenInspect examine the architecture of background coding agents, covering the technical decisions behind building cloud-based development systems. Cognition's internal data shows Devin-authored commits grew from 16% to 80% of all commits between January and March 2025, while engineering headcount grew only 10%.
Key Questions Answered
- SRE Auto-Triage as the Highest-ROI Entry Point: The most common and immediately valuable use case for background agents is first-responder triage on alerts from Datadog, Sentry, or Slack. The agent does not need to resolve the incident: collecting full context, referencing playbooks, and drafting a pull request before a human reviews it already delivers substantial value. OpenInspect supports generic webhooks as the trigger; teams report spending between $1,000 and $5,000 per engineer per month on agent compute for this workflow.
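The first-responder triage flow can be sketched as a function that turns an incoming alert into a context bundle for the agent. This is an illustrative sketch only: the payload fields (`source`, `title`), the `PLAYBOOKS` index, and the names `TriageContext` and `build_triage_context` are all assumptions, not OpenInspect's webhook schema or API.

```python
from collections.abc import Mapping
from dataclasses import dataclass, field

# Hypothetical playbook index: alert keyword -> playbook path.
PLAYBOOKS = {
    "timeout": "playbooks/upstream-timeouts.md",
    "out of memory": "playbooks/out-of-memory.md",
}


@dataclass
class TriageContext:
    source: str
    title: str
    playbooks: list[str] = field(default_factory=list)
    # The agent only collects context and drafts; a human reviews the result.
    needs_human_review: bool = True


def build_triage_context(payload: Mapping[str, object]) -> TriageContext:
    """Collect alert context and match playbooks by keyword in the title."""
    title = str(payload.get("title", ""))
    source = str(payload.get("source", "unknown"))
    lowered = title.lower()
    matched = [path for keyword, path in PLAYBOOKS.items() if keyword in lowered]
    return TriageContext(source=source, title=title, playbooks=matched)


# Hypothetical alert payload, as a generic webhook might deliver it.
context = build_triage_context({"source": "sentry", "title": "Checkout API timeout spike"})
print(context)
```

Keeping `needs_human_review` set by default mirrors the point above: the agent prepares the context and a draft, and resolution stays with a person.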
Notable Moment
Cognition ran an internal experiment building a full product using autonomous agents with auto-merge and zero code review. By the two-week mark, changing a single button color required touching ten different implementations. The conclusion: scheduled human-led or agent-led cleanup of duplication is necessary, or codebases regress toward their worst contributor's patterns.