AI Summary
→ WHAT IT COVERS
Brian Scanlan, Senior Principal Engineer at Intercom, details how the company doubled engineering throughput (measured in merged PRs per R&D head) over nine months using Claude Code. He demonstrates the internal skills repository, telemetry infrastructure, session analysis tooling, and cultural frameworks that enabled a 150+ person R&D organization to ship at 2x velocity while maintaining or improving code quality.

→ KEY INSIGHTS
- **Velocity Measurement:** Use merged pull requests per R&D head as a leading indicator of AI adoption effectiveness. Intercom tracked this metric from baseline through a nine-month Claude Code rollout, achieving 2x throughput. The raw PR count grew even higher since headcount also increased during this period. A crude metric beats no metric when building organizational accountability around AI tooling adoption.
- **Skills Distribution via IT Systems:** Deploy Claude Code plugins through internal IT infrastructure rather than relying on Claude's native plugin sync mechanism, which proved unreliable across hundreds of laptops. Pushing skill files directly to disk via IT management tools eliminates version drift, reduces debugging overhead, and ensures every engineer runs identical, current tooling without manual intervention or update failures.
- **LLM Judges for Quality Regression Detection:** After Claude Code began generating low-quality PR descriptions (summarizing code rather than intent), Intercom built an LLM judge to evaluate months of historical PR description data. The judge confirmed a downward trend, prompting a mandatory "create PR" skill enforced via hooks that block the GitHub CLI. Post-intervention, the LLM judge confirmed quality returned to above-baseline levels.
- **Session Telemetry for Org-Level Diagnostics:** Collect Claude Code session JSON files, anonymize them, upload to S3, and build user-level dashboards showing session efficiency percentiles, skill invocation patterns, and dropout rates. This surfaces systemic problems, like an MCP never triggering correctly, that are invisible without aggregate data. Honeycomb works well for real-time skill invocation tracking across the engineering organization.
- **Self-Improving Skills via Feedback Loops:** Build skills that update themselves when they encounter novel solutions. Intercom's flaky spec skill fixes a test, documents the new pattern back into the skill file, then fans out to find all similar failing tests. This compounds from roughly 1x performance at launch to 10x or higher as the skill accumulates domain-specific patterns, without requiring ongoing human maintenance.
- **Tech Debt as AI Onboarding Strategy:** When introducing AI coding tools to an engineering team, direct engineers to spend one month fixing everything they hate about the codebase. The combination of low-friction execution and high emotional payoff builds AI tool fluency while delivering measurable quality improvements. Intercom migrated an entire Go microservice to Ruby in a single Claude Code session, previously a multi-month roadmap item requiring organizational consensus.

→ NOTABLE MOMENT
Scanlan described how Intercom's CI system became ten times more expensive almost overnight once Claude Code adoption accelerated PR volume. After fixing those infrastructure bottlenecks, code review became the new constraint. The implication: AI coding tools will sequentially expose every weak point in a delivery pipeline, requiring teams to fix bottlenecks they previously never stressed.

💼 SPONSORS
- Celigo: https://celigo.com/howiai
- Cursor: https://chatprd.ai/howiai

🏷️ Claude Code, Engineering Velocity, AI Coding Tools, Developer Productivity, Internal Developer Platforms, Technical Debt
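
The velocity metric from the Key Insights (merged PRs per R&D head, compared against a baseline) can be sketched in a few lines of Python. This is a minimal illustration, not Intercom's tooling: the `Period` class, the `throughput_multiple` helper, and all of the numbers are hypothetical, chosen only to show why the raw PR count can grow faster than the per-head figure when headcount also rises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    """One measurement window. All values used below are hypothetical."""
    label: str
    merged_prs: int
    rd_headcount: int

    @property
    def prs_per_head(self) -> float:
        # Normalise raw PR volume by team size so hiring alone
        # does not inflate the metric.
        if self.rd_headcount <= 0:
            raise ValueError("rd_headcount must be positive")
        return self.merged_prs / self.rd_headcount


def throughput_multiple(baseline: Period, current: Period) -> float:
    """Return current per-head throughput as a multiple of the baseline."""
    return current.prs_per_head / baseline.prs_per_head


# Made-up numbers for illustration only; these are not Intercom's data.
baseline = Period("baseline", merged_prs=1500, rd_headcount=150)
current = Period("nine months later", merged_prs=3600, rd_headcount=180)

per_head = throughput_multiple(baseline, current)
raw = current.merged_prs / baseline.merged_prs
print(f"per-head throughput: {per_head:.1f}x, raw PR count: {raw:.1f}x")
```

With these invented inputs the per-head multiple is 2.0x while the raw PR count is 2.4x, which mirrors the point that raw volume outpaces the normalised metric when the team is also growing.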
