How I AI

How Intercom 2x’d their engineering velocity in 9 months with Claude Code | Brian Scanlan

78 min episode · 3 min read

Topics

Software Development

AI-Generated Summary

Key Takeaways

  • Velocity Measurement: Use merged pull requests per R&D head as a leading indicator of AI adoption effectiveness. Intercom tracked this metric from baseline through a 9-month Claude Code rollout, achieving 2x throughput. The raw PR count grew even higher since headcount also increased during this period. A crude metric beats no metric when building organizational accountability around AI tooling adoption.
  • Skills Distribution via IT Systems: Deploy Claude Code plugins through internal IT infrastructure rather than relying on Claude's native plugin sync mechanism, which proved unreliable across hundreds of laptops. Pushing skill files directly to disk via IT management tools eliminates version drift, reduces debugging overhead, and ensures every engineer runs identical, current tooling without manual intervention or update failures.
  • LLM Judges for Quality Regression Detection: After Claude Code began generating low-quality PR descriptions (summarizing code rather than intent), Intercom built an LLM judge to evaluate months of historical PR description data. The judge confirmed a downward trend, prompting a mandatory "create PR" skill enforced via hooks that block the GitHub CLI. Post-intervention, the LLM judge confirmed quality returned to above-baseline levels.
  • Session Telemetry for Org-Level Diagnostics: Collect Claude Code session JSON files, anonymize them, upload to S3, and build user-level dashboards showing session efficiency percentiles, skill invocation patterns, and dropout rates. This surfaces systemic problems—like an MCP never triggering correctly—that are invisible without aggregate data. Honeycomb works well for real-time skill invocation tracking across the engineering organization.
  • Self-Improving Skills via Feedback Loops: Build skills that update themselves when they encounter novel solutions. Intercom's flaky spec skill fixes a test, documents the new pattern back into the skill file, then fans out to find all similar failing tests. This compounds from roughly 1x performance at launch to 10x or higher as the skill accumulates domain-specific patterns, without requiring ongoing human maintenance.
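The LLM-judge workflow in the third takeaway can be sketched as a small harness: score each historical PR description with a pluggable judge, then compare a recent window against the pre-rollout baseline. The rubric wording, the 1–5 scale, and the drop threshold below are illustrative assumptions, not Intercom's actual setup; the judge itself is passed in as a callable so any LLM client can back it.

```python
"""Sketch of an LLM-judge regression check for PR description quality.
The prompt, scale, and threshold are assumptions for illustration."""
from statistics import mean
from typing import Callable

JUDGE_PROMPT = (
    "Rate this PR description 1-5. High scores explain intent and "
    "context; low scores merely restate the code changes."
)

def detect_regression(
    descriptions: list[str],
    judge: Callable[[str, str], int],  # (prompt, description) -> score 1-5
    baseline_n: int,
    drop_threshold: float = 0.5,
) -> tuple[float, float, bool]:
    """Score descriptions in chronological order; flag a regression when
    the mean of the recent window falls more than drop_threshold below
    the mean of the first baseline_n descriptions."""
    scores = [judge(JUDGE_PROMPT, d) for d in descriptions]
    baseline = mean(scores[:baseline_n])
    recent = mean(scores[baseline_n:])
    return baseline, recent, (baseline - recent) > drop_threshold
```

In practice the `judge` callable would wrap an LLM API call; the same harness re-run after the mandatory create-PR skill shipped is what confirmed quality returned above baseline.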
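The "hooks that block the GitHub CLI" enforcement could look something like the following, assuming Claude Code's documented hook contract (a PreToolUse hook receives the tool call as JSON on stdin, and exit code 2 blocks the call while surfacing stderr back to the model). The field names and message are taken from that general contract, not from Intercom's actual hook.

```python
"""PreToolUse hook sketch: block raw `gh pr create` so engineers go
through the mandated create-PR skill. Assumes Claude Code's hook
contract: tool-call JSON on stdin, exit 2 blocks and returns stderr."""
import json
import re
import sys

BLOCKED = re.compile(r"\bgh\s+pr\s+create\b")

def verdict(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse payload."""
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if BLOCKED.search(command):
        return 2, "Direct `gh pr create` is disabled; use the create-PR skill."
    return 0, ""

def main() -> int:
    # Wired up via a "Bash"-matcher PreToolUse entry in settings.json.
    code, msg = verdict(json.load(sys.stdin))
    if msg:
        print(msg, file=sys.stderr)
    return code
```

Because the hook only pattern-matches the shell command, `git push` and other GitHub CLI subcommands pass through untouched.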
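The session-telemetry takeaway (collect session JSON, anonymize, upload to S3) reduces to a short pipeline. The field names below (`user`, `cwd`, etc.) and the JSONL bucket layout are assumptions; the key design point is hashing identifiers with a salt so dashboards can still group sessions by pseudonymous user without exposing names.

```python
"""Sketch of a session-telemetry pipeline: hash identifying fields in a
Claude Code session record, then ship batches to S3 as JSONL. Field
names and bucket layout are assumptions, not Intercom's schema."""
import hashlib
import json

# Fields that identify the engineer or their machine.
SENSITIVE_FIELDS = {"user", "userEmail", "cwd", "gitBranch"}

def anonymize(record: dict, salt: str = "org-secret") -> dict:
    """Return a copy with identifying fields replaced by stable salted
    hashes, so per-user dashboards work without real identities."""
    out = {}
    for key, value in record.items():
        if key in SENSITIVE_FIELDS:
            out[key] = hashlib.sha256((salt + str(value)).encode()).hexdigest()[:12]
        else:
            out[key] = value
    return out

def upload(records: list[dict], bucket: str, key: str) -> None:
    """Upload anonymized session records as one JSONL object (boto3)."""
    import boto3  # third-party; imported lazily so anonymize() stays stdlib-only
    body = "\n".join(json.dumps(anonymize(r)) for r in records)
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode())
```

From there, the S3 objects feed the percentile and skill-invocation dashboards; real-time skill tracking went to Honeycomb instead.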

What It Covers

Brian Scanlan, Senior Principal Engineer at Intercom, details how the company doubled engineering throughput (measured in merged PRs per R&D head) over nine months using Claude Code. He demonstrates the internal skills repository, telemetry infrastructure, session analysis tooling, and cultural frameworks that enabled a 150+ person R&D organization to ship at 2x velocity while maintaining or improving code quality.

Key Questions Answered

  • Tech Debt as AI Onboarding Strategy: When introducing AI coding tools to an engineering team, direct engineers to spend one month fixing everything they hate about the codebase. The combination of low-friction execution and high emotional payoff builds AI tool fluency while delivering measurable quality improvements. Intercom migrated an entire Go microservice to Ruby in a single Claude Code session—previously a multi-month roadmap item requiring organizational consensus.

Notable Moment

Scanlan described how Intercom's CI system became ten times more expensive almost overnight once Claude Code adoption accelerated PR volume. After fixing those infrastructure bottlenecks, code review became the new constraint. The implication: AI coding tools will sequentially expose every weak point in a delivery pipeline, requiring teams to fix bottlenecks they previously never stressed.
