
Harrison Chase of LangChain on Deep Agents, LangSmith, and Earning Trust | NVIDIA AI Podcast Ep. 297

24 min episode · 2 min read

Topics: Artificial Intelligence

AI-Generated Summary

Key Takeaways

  • Deep Agents Architecture: LangChain's deep agents harness unifies patterns from Claude Code, Manus, and Deep Research into one customizable framework. Rather than rebuilding scaffolding per use case, developers adjust only prompts and tools. The harness includes a file system, bash tool, and sub-agent support, making coding-capable models like Qwen Coder stronger general-purpose agents than non-coding equivalents. (A minimal usage sketch follows this list.)
  • Evaluation-Driven Development: Start eval datasets with as few as 5–10 scenarios, not thousands. Define what a good and a bad response look like for each. Run the dataset after every prompt change to measure improvement or regression. Expand the dataset continuously as real users surface unexpected but legitimate behaviors during limited rollouts to alpha users or 1% of traffic. (See the eval-loop sketch after this list.)
  • Agent Rebuild Cadence: Enterprise teams should plan to rebuild agents on updated harnesses roughly every nine months. Architectures from 18 months ago limit both performance and scope. Models that previously couldn't handle complex tasks now can, so teams not reevaluating scope are leaving measurable capability gains unrealized, particularly for tasks that were previously out of reach.
  • Open Models for Always-On Agents: Frontier model costs become prohibitive when agents run proactively every 10 minutes rather than on-demand. Open-source models, now approaching frontier capability for driving agent harnesses, make always-on event-driven agents economically viable. LangChain joined NVIDIA's Nemotron coalition specifically to advance open models that run reliably within open harnesses and runtimes like NVIDIA's OpenShell. (See the always-on loop sketch after this list.)
  • LangSmith Build-Test-Run-Manage Cycle: LangSmith covers the full agent development lifecycle beyond just observability. The build phase uses open-source frameworks like LangGraph or deep agents. LangSmith then handles testing with eval datasets, scaled deployment, and runtime observability. Enterprises that separate these phases by months risk shipping on outdated architectures; the platform is designed to compress that cycle significantly. (See the tracing-setup sketch after this list.)
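
A minimal sketch of the customization model described in the first takeaway, assuming the open-source deepagents package's create_deep_agent entry point; the search tool, its body, and the instructions string are hypothetical placeholders, not LangChain's own examples:

```python
# Sketch: per-use-case customization of a deep agent is just the prompt and
# the tool list; the harness supplies the file system, planning, and
# sub-agent scaffolding. Assumes the deepagents package API.
from deepagents import create_deep_agent

def internet_search(query: str) -> str:
    """Hypothetical tool: wire any real search API in here."""
    return f"(stub) results for: {query}"

agent = create_deep_agent(
    tools=[internet_search],
    instructions="You are a research assistant. Cite a source for every claim.",
)

# The agent is a LangGraph graph, so it is invoked with a messages dict.
result = agent.invoke({"messages": [{"role": "user", "content": "What is MCP?"}]})
print(result["messages"][-1].content)
```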
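
The evaluation loop from the second takeaway, as a framework-agnostic sketch; run_agent and grade are stand-ins for your agent entry point and your checking logic (a trivial heuristic here, often an LLM judge in practice), and the scenario is an invented example:

```python
# Sketch of evaluation-driven development: a small eval set, re-run after
# every prompt change, that grows as real users surface new cases.
EVAL_SET = [
    {
        "input": "A customer asks for a refund on a damaged item.",
        "good": "cites the refund policy before promising anything",
        "bad": "invents a refund amount",
    },
    # ...start with 5-10 scenarios; append cases from alpha/1% rollouts.
]

def run_agent(prompt_version: str, user_input: str) -> str:
    # Stand-in: call your real agent here (e.g. agent.invoke(...)).
    return f"({prompt_version}) Per our refund policy... {user_input}"

def grade(output: str, case: dict) -> bool:
    # Stand-in heuristic check; swap in assertions or an LLM-as-judge.
    return "policy" in output.lower()

def run_evals(prompt_version: str) -> float:
    passed = sum(grade(run_agent(prompt_version, c["input"]), c) for c in EVAL_SET)
    print(f"{prompt_version}: {passed}/{len(EVAL_SET)} passed")
    return passed / len(EVAL_SET)

# Run before and after each prompt change to catch regressions:
run_evals("prompt-v12")
run_evals("prompt-v13")
```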
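
To make the always-on economics concrete, a minimal polling loop might look like the following; the local endpoint, model name, and check_inbox event source are assumptions standing in for any open model served behind an OpenAI-compatible API:

```python
# Sketch of an always-on, event-driven agent that wakes every 10 minutes.
# Assumes an open model (e.g. a Nemotron-class model) served locally behind
# an OpenAI-compatible endpoint; names below are placeholders.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

def check_inbox() -> list[str]:
    """Hypothetical event source: return new items needing attention."""
    return []

while True:
    for event in check_inbox():
        # Every tick can mean many model calls, so per-token price dominates
        # the economics; this is where frontier pricing becomes prohibitive.
        reply = client.chat.completions.create(
            model="local-open-model",
            messages=[{"role": "user", "content": f"Triage this event: {event}"}],
        )
        print(reply.choices[0].message.content)
    time.sleep(600)  # 10 minutes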
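
One concrete touchpoint in that build-test-run-manage cycle: LangSmith's runtime tracing is switched on via environment variables for LangChain/LangGraph code. Variable names follow LangSmith's documented setup; the key and project name are placeholders, and newer docs also accept LANGSMITH_-prefixed equivalents:

```python
# Sketch: enabling LangSmith tracing so subsequent agent runs are observable.
import os

os.environ["LANGCHAIN_TRACING_V2"] = "true"              # turn on tracing
os.environ["LANGCHAIN_API_KEY"] = "<langsmith-api-key>"  # placeholder key
os.environ["LANGCHAIN_PROJECT"] = "deep-agent-demo"      # optional grouping

# Any LangChain/LangGraph invocation after this point (e.g. agent.invoke(...))
# is traced to LangSmith, feeding the test and manage phases of the cycle.
```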

What It Covers

Harrison Chase, CEO of LangChain, explains how deep agents work as a general-purpose, model-agnostic harness built on patterns from Claude Code, Manus, and Deep Research. He covers LangSmith's observability and evaluation tools, open-source model viability, and three near-term shifts: async sub-agents, always-on event-driven agents, and agent identity.

Notable Moment

Chase noted that coding-focused models outperform general-purpose models as agent drivers because the deep agents harness structurally resembles a coding environment, complete with a file system and bash tools. This means model selection for agentic tasks should prioritize coding benchmark performance over general reasoning scores alone.
