Mandate Executive Buy-In and All-In Platform Choice to Drive Adoption

Intercom set a goal to double engineering throughput (measured as code changes per R&D person) without growing headcount, standing up a full-time "2x" team to guide hundreds of engineers. Leadership rewrote job descriptions to make AI adoption binary: engineers who do not adopt fail expectations. They repeated the urgency message across forums, celebrated wins publicly in Slack channels, ran AI hackathons and immersion days, and staffed top talent full-time on integration. Critically, they abandoned tool sprawl (GitHub Copilot, Cursor, Augment) in favor of Claude Code as the single platform, avoiding the "model anxiety" that plagues multicloud setups. Consolidation enabled compounding benefits: custom plugins pushed to every laptop (sidestepping install issues) let them treat Claude like a senior engineer onboarded to Rails conventions, React patterns, testing standards, and security rules. The result is a flywheel of shared context, skills, and hooks in which each failure refines the guidance for everyone.

Build Durable, High-Quality Skills for Every Recurring Task

Focus on small, testable skills rather than custom multi-agent orchestrators, and prioritize by lifetime value, backtesting candidates against historical code, incidents, and changes. Skills encapsulate institutional knowledge (e.g., fixing flaky specs across hundreds of thousands of tests using cheat codes and progressive disclosure) and self-update via feedback loops: session transcripts are mined from S3, with Honeycomb hooks for observability. Connect Claude to production systems with mature controls (audits, permissions) so it can operate at full scope: debugging, testing, and planning, not just autocomplete. A conservative stack (a Ruby on Rails monolith) helps here, and skills absorb the CI overload that volume spikes create. Example: an auto-generated skill for flaky tests matches the output of a senior Rails engineer, built iteratively by giving the agent a goal and refining. Continuous improvement rides on vendor shipments (e.g., from Anthropic) without rebuilding, keeping Intercom ahead.
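The self-updating feedback loop can be sketched as follows. This is a minimal illustration, not Intercom's implementation: all function and field names are hypothetical, and in production the transcripts would be mined from S3 with Honeycomb hooks rather than passed in memory.

```python
# Sketch of a skill that refines itself from session transcripts.
# Hypothetical names throughout; real transcripts would come from S3.
from collections import Counter

def mine_failures(transcripts, threshold=2):
    """Return error signatures that recur across failed sessions."""
    sigs = Counter(
        t["error_signature"]
        for t in transcripts
        if t["outcome"] == "failure"
    )
    return [sig for sig, n in sigs.items() if n >= threshold]

def refine_skill(skill_text, transcripts):
    """Append one guidance line per recurring failure. Progressive
    disclosure: the skill only grows when sessions prove a gap exists."""
    for sig in mine_failures(transcripts):
        line = f"- If you see `{sig}`, re-run the spec in isolation first."
        if line not in skill_text:
            skill_text += "\n" + line
    return skill_text

transcripts = [
    {"outcome": "failure", "error_signature": "Capybara::ElementNotFound"},
    {"outcome": "failure", "error_signature": "Capybara::ElementNotFound"},
    {"outcome": "success", "error_signature": None},
]
skill = refine_skill("# Flaky-spec skill", transcripts)
```

The refinement is idempotent: re-running it against the same transcripts leaves the skill unchanged, so the loop can run continuously without bloating the guidance.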

Give Agents Problems, Not Tasks, to Unlock Autonomy

Shift from micromanaging tasks to describing problems; Claude selects the tools autonomously. In one security incident (Snowflake metadata accidentally pushed to a public GitHub repo), Brian simply described the issue. Claude joined Slack, invoked a data-breach skill Brian did not know existed, downloaded the files, analyzed them against policy, deemed the exposure innocuous, and listed follow-up steps in two minutes (versus roughly twenty manually). A maturity model moves engineers through stages: use Claude for everything → automate recurring work into skills → write and improve skills → optimize the environment (architecture, docs) for agents. This pushes humans up the stack, much as cloud computing displaced sysadmin work, and enables an agent-first SDLC even with current models.
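The shift from tasks to problems can be illustrated with a toy dispatcher. In the real system Claude itself chooses a skill from its tool descriptions; the keyword-overlap scoring and skill names below are hypothetical stand-ins for that selection step.

```python
# Toy sketch of "problems, not tasks": the caller states a problem and
# the agent picks the skill. In reality the model does this selection;
# keyword matching here is only a mock of that choice.
SKILLS = {
    "data_breach_triage": {"keywords": {"leak", "exposed", "public", "breach"}},
    "flaky_spec_fix":     {"keywords": {"flaky", "spec", "ci", "intermittent"}},
}

def select_skill(problem: str) -> str:
    """Stand-in for the model's tool choice: score each skill by keyword
    overlap with the problem statement and return the best match."""
    words = set(problem.lower().split())
    return max(SKILLS, key=lambda name: len(SKILLS[name]["keywords"] & words))

problem = "Snowflake metadata was exposed in a public GitHub repo"
```

The point is the interface: the caller never enumerates steps (join Slack, download files, check policy); it only states the problem, and the selection plus execution plan belong to the agent.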

Data Proves 2x Gains with Zero Risk Increase

After the January rollout (the decision was made in December), PR throughput doubled in under a year; Claude now authors 90%+ of PRs, and 17.6% are auto-approved with no human in the loop while remaining SOC 2, ISO 27001, and HIPAA compliant, thanks to audited agents and PRs shaped to be simple and safe. Defects close faster (some teams have hit backlog zero), and Stanford metrics show rising code quality. CI initially melted under the volume (since fixed), and individual skills have been invoked thousands of times. Use has spread virally beyond engineering (consoles, product experiments). Trade-offs: adoption is uneven and requires maturity guidance, and humans are now weaker than well-defined agents at routine reviews, so automating those reviews reduces risk rather than adding it.
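An auto-approval gate of this kind can be sketched as a simple policy check plus an audit trail. The thresholds, path rules, and field names below are hypothetical illustrations, not Intercom's actual compliance policy.

```python
# Hedged sketch of an auto-approval gate: only PRs shaped to be simple
# and safe skip human review, and every decision is audit-logged.
# Thresholds and path prefixes are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class PR:
    lines_changed: int
    paths: list
    ci_green: bool

SAFE_PREFIXES = ("spec/", "docs/", "config/locales/")

def auto_approve(pr: PR, audit_log: list) -> bool:
    """Approve without a human only when the PR is green, small, and
    confined to low-risk paths; log every decision for compliance."""
    safe = (
        pr.ci_green
        and pr.lines_changed <= 50
        and all(p.startswith(SAFE_PREFIXES) for p in pr.paths)
    )
    audit_log.append({"paths": pr.paths, "auto_approved": safe})
    return safe

log = []
small_fix = PR(lines_changed=12, paths=["spec/models/user_spec.rb"], ci_green=True)
risky = PR(lines_changed=400, paths=["app/models/billing.rb"], ci_green=True)
```

Shaping PRs to pass a gate like this (small diffs, low-risk paths, green CI) is what makes removing the human reviewer a risk reduction rather than a risk trade.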