Claude Code Leak: 12 Primitives for Production Agents

Velocity Risks Exposed by Leaks Demand Boring Primitives

Anthropic's back-to-back leaks (a draft on Claude Mythos, then the full Claude Code repo) highlight a core tension in AI development: shipping velocity is outpacing operational safeguards. The Claude Code leak, which stemmed from a build config error (possibly AI-assisted), exposed the architecture of a product with a $2.5B run rate. The hype has focused on upcoming features, but the real value lies in the 12 primitives that sustain production agents. These aren't flashy; they are "boring" basics, like the build validation and publish steps that prevent leaks. Anthropic writes 90% of its code with AI and ships 5 releases per engineer per day, which amplifies the risk of config drift. The lesson: high-velocity teams must harden their primitives without slowing their cadence.

"This is the second significant leak from Anthropic in the last few days and it's worth asking ourselves why... is your development velocity outrunning your operational discipline?" – Nate B. Jones, framing leaks as a symptom of unchecked speed in AI-assisted dev.

Tool and Permission Foundations Prevent Demo-Only Agents

Claude Code favors structural metadata over inference. A dual-registry system, with 207 user-facing commands and 184 model-facing tools, defines each capability as a dictionary with a name, a source, and a description. Implementations load on demand, which allows runtime filtering and introspection without side effects. Without a registry, orchestration breaks every time a tool is added.
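A metadata-first registry with on-demand loading can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Claude Code's actual implementation: the names ToolSpec, ToolRegistry, and the "module:function" entry_point convention are hypothetical, and two standard-library functions stand in for real tools.

```python
from dataclasses import dataclass
from importlib import import_module
from typing import Callable, Optional


@dataclass(frozen=True)
class ToolSpec:
    """Metadata only; nothing here imports or runs the tool."""
    name: str
    source: str        # e.g. "builtin", "plugin", "skill"
    description: str
    entry_point: str   # hypothetical "module:function" path, resolved on first use


class ToolRegistry:
    def __init__(self) -> None:
        self._specs: dict[str, ToolSpec] = {}
        self._loaded: dict[str, Callable] = {}

    def register(self, spec: ToolSpec) -> None:
        self._specs[spec.name] = spec

    def list_tools(self, source: Optional[str] = None) -> list[dict]:
        """Introspection with no side effects: returns metadata dictionaries."""
        return [
            {"name": s.name, "source": s.source, "description": s.description}
            for s in self._specs.values()
            if source is None or s.source == source
        ]

    def load(self, name: str) -> Callable:
        """Import the implementation only when it is first needed."""
        if name not in self._loaded:
            module_name, func_name = self._specs[name].entry_point.split(":")
            self._loaded[name] = getattr(import_module(module_name), func_name)
        return self._loaded[name]


# Standard-library functions stand in for real tool implementations.
registry = ToolRegistry()
registry.register(ToolSpec("join_path", "builtin", "Join path segments", "os.path:join"))
registry.register(ToolSpec("dumps", "plugin", "Serialize to JSON", "json:dumps"))
```

Calling registry.list_tools(source="plugin") filters on metadata alone; the json module's function is only resolved when registry.load("dumps") is first called.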

Permissions are tiered by trust: built-in tools (high trust, always on), plugins (medium trust, can be disabled), and skills or other user-defined tools (low trust). The bash tool alone has 18 security modules, covering pre-approved patterns, destructive-command warnings, and sandboxing. For your own agent: classify each action as read, mutate, or destructive; log every decision; and add domain-specific checks. Without this layer an agent cannot act safely in production; it stays in demo territory.
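The classify-and-log pattern might look like the sketch below. The three risk classes and three trust tiers come from the description above; the decision policy itself (reads pass, destructive actions always ask, mutations pass only for built-in tools) is an assumption made for illustration, as are the names Risk, Trust, and PermissionGate.

```python
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    READ = "read"
    MUTATE = "mutate"
    DESTRUCTIVE = "destructive"


class Trust(Enum):
    BUILTIN = 3   # high trust, always on
    PLUGIN = 2    # medium trust, can be disabled
    SKILL = 1     # low trust, user-defined


@dataclass
class PermissionGate:
    audit_log: list[dict] = field(default_factory=list)

    def decide(self, tool: str, risk: Risk, trust: Trust) -> str:
        # Hypothetical policy, chosen only to illustrate the shape of the check.
        if risk is Risk.READ:
            decision = "allow"
        elif risk is Risk.DESTRUCTIVE:
            decision = "ask"          # always require a human
        elif trust is Trust.BUILTIN:
            decision = "allow"        # mutation by a high-trust tool
        else:
            decision = "ask"          # mutation by a plugin or skill
        # Every decision is recorded, whether or not the action proceeds.
        self.audit_log.append(
            {"tool": tool, "risk": risk.value, "trust": trust.name, "decision": decision}
        )
        return decision
```

Keeping the audit log inside the gate means no caller can obtain a decision without leaving a record, which is what makes later replay possible.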

"If your agent can take actions in the world... and you don't have a permissions layer, you have just a demo right you don't have a product." – Jones, distinguishing safe systems from notebooks.

For your stack: start by building a list_tools() function that returns metadata. Classify risks in advance, and keep audit trails so that runs can be replayed.

State Persistence and Budgeting Ensure Crash-Resilient Workflows

Agents crash constantly: tabs close, connections drop. Claude Code persists each full session as JSON: ID, messages, tokens, permissions, config. On resume, the query engine is reconstructed entirely from that record. Workflow state is kept separately and tracks each step (planned, awaiting approval, executing), which prevents duplicate execution on retry.
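A minimal sketch of session persistence with separate workflow state follows. The field names mirror the list above, but the Session class, the file layout, the extra "done" status, and the rule in should_execute are assumptions for illustration; the leaked code may differ on all of them.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field


@dataclass
class Session:
    session_id: str
    messages: list[dict] = field(default_factory=list)
    tokens_used: int = 0
    permissions: dict = field(default_factory=dict)
    config: dict = field(default_factory=dict)
    # Workflow state lives apart from the conversation:
    # step id -> "planned" | "awaiting_approval" | "executing" | "done"
    workflow: dict[str, str] = field(default_factory=dict)


def save(session: Session, directory: str) -> str:
    """Write to a temp file, then rename, so a crash never leaves a torn file."""
    path = os.path.join(directory, f"{session.session_id}.json")
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(asdict(session), fh)
    os.replace(tmp, path)
    return path


def resume(path: str) -> Session:
    """Rebuild the full session object from disk."""
    with open(path, encoding="utf-8") as fh:
        return Session(**json.load(fh))


def should_execute(session: Session, step_id: str) -> bool:
    """Assumed rule: only steps still marked 'planned' (or unseen) may start."""
    return session.workflow.get(step_id, "planned") == "planned"
```

Calling save() after every event rather than only at shutdown is what makes a closed tab recoverable. The should_execute check is what stops a retry from repeating a step that already ran.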

Token budgets enforce hard limits. A maximum turn count and a projected-usage check halt the run before the API call is made, returning a structured stop reason. Compaction thresholds trim history and prioritize recent entries. This avoids runaway costs and builds trust, much as Amazon's returns policy does.
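The pre-call budget check can be sketched as follows. The Budget class and the shape of the returned stop dictionary are hypothetical; the point is only that the projection is evaluated before the request is sent and that the refusal is structured data rather than an exception or a silent truncation.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    max_turns: int
    max_tokens: int
    turns: int = 0
    input_tokens: int = 0
    output_tokens: int = 0

    def check(self, projected_input: int, projected_output: int) -> dict:
        """Call before each API request; returns a structured go/stop decision."""
        if self.turns >= self.max_turns:
            return {"ok": False, "reason": "max_turns", "turns": self.turns}
        projected = (
            self.input_tokens + self.output_tokens + projected_input + projected_output
        )
        if projected > self.max_tokens:
            return {
                "ok": False,
                "reason": "token_budget",
                "projected": projected,
                "limit": self.max_tokens,
            }
        return {"ok": True}

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Call after each API response with the actual usage."""
        self.turns += 1
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
```

With Budget(max_turns=2, max_tokens=1000), a first call projected at 500 tokens passes; once those 500 are recorded, a second call projected at 600 is refused with reason "token_budget" before any request goes out.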

"Anthropic being a really responsible citizen here and saying 'We don't want you to have runaway budget spending that you do not clearly intend it's the same way that Amazon enables returns which may not be good for Amazon in the short term but increase customer trust.'" – Jones, on why self-imposed limits pay off long-term.

For your implementation: persist after every event, not only at shutdown. Model workflows explicitly and checkpoint state the way 1990s games handled save points. Track input and output tokens, and project usage before each call.

Streaming, Logging, and Verification Build Observability

Structured streaming turns every event into something the user can see: message start, tool match, token counts, and crash reasons (black-box style). Because events arrive as they happen, users can intervene mid-thought. System logs capture the non-conversational actions, such as context loads, routing, and permission decisions, categorized for enterprise audits.
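One way to make stream events typed is to give each kind its own small class and serialize it with a type tag. The event names below follow the list above, but the classes, the JSON encoding, and the simulated failure in run_turn are all assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterator, Union


@dataclass
class MessageStart:
    message_id: str


@dataclass
class ToolMatch:
    tool: str
    arguments: dict


@dataclass
class TokenUsage:
    input_tokens: int
    output_tokens: int


@dataclass
class Crash:
    reason: str


Event = Union[MessageStart, ToolMatch, TokenUsage, Crash]


def encode(event: Event) -> str:
    """Serialize one event as a JSON line tagged with its type."""
    return json.dumps({"type": type(event).__name__, **asdict(event)})


def run_turn() -> Iterator[Event]:
    """Hypothetical turn: emits events as they happen, including the crash."""
    yield MessageStart("msg-1")
    yield ToolMatch("read_file", {"path": "README.md"})
    try:
        raise TimeoutError("upstream connection dropped")  # simulated failure
    except Exception as exc:
        # The crash reason becomes a first-class event, not a lost stack trace.
        yield Crash(f"{type(exc).__name__}: {exc}")
```

A consumer can loop over run_turn(), print encode(event) for each item, and decide after any ToolMatch whether to interrupt.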

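Verification happens at two levels: the agent checks its own work after a run, and tests exercise the harness itself (for example: do destructive tools require approval? Does a token overrun halt gracefully?). Because the harness keeps evolving, each change needs regression tests against the guardrails.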
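Harness-level checks can be kept as a table of named guardrails that is re-run after every harness change. In this sketch the harness function is a stand-in invented for the example, and the guardrail names and request fields are hypothetical.

```python
from typing import Callable

Harness = Callable[[dict], dict]


def harness(request: dict) -> dict:
    """Hypothetical harness under test: maps a request to a decision."""
    if request.get("risk") == "destructive" and not request.get("approved"):
        return {"status": "blocked", "reason": "approval_required"}
    if request.get("projected_tokens", 0) > request.get("token_limit", float("inf")):
        return {"status": "stopped", "reason": "token_budget"}
    return {"status": "ok"}


# Each guardrail has a stable name so a regression report says exactly what broke.
GUARDRAILS: dict[str, Callable[[Harness], bool]] = {
    "destructive_requires_approval":
        lambda h: h({"risk": "destructive"}).get("status") == "blocked",
    "approved_destructive_runs":
        lambda h: h({"risk": "destructive", "approved": True}).get("status") == "ok",
    "token_overrun_halts_gracefully":
        lambda h: h({"projected_tokens": 10, "token_limit": 5}).get("reason") == "token_budget",
}


def run_guardrails(h: Harness) -> dict[str, bool]:
    """Run every named guardrail against a harness and report pass/fail."""
    return {name: check(h) for name, check in GUARDRAILS.items()}
```

Running run_guardrails(harness) in CI after each harness change turns "did we weaken a safety check?" into a named, failing test instead of a production surprise.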

"The conversational transcript needs to tell the user what the agent did not just what it said." – Jones, on why event logs are enterprise-essential.

Apply this: emit typed events (tool deliberations, crashes). Log actions separately from the conversation, and test every harness change against named guardrails.

Scaling Patterns: Dynamic Pools and Agent Typing

Operational maturity shows in how tool pools are assembled: from the 184 tools, each session gets its own subset, selected via flags and denylists for efficiency. Transcript compaction trims automatically after a set number of turns while preserving instructions.
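Both ideas reduce to small filtering functions. The sketch below assumes tools are metadata dictionaries with "name" and "source" keys and that instructions are the messages with role "system"; the function names, parameters, and that role convention are illustrative assumptions, not details from the leak.

```python
def assemble_pool(
    all_tools: list[dict], enabled_sources: set[str], denylist: set[str]
) -> list[dict]:
    """Pick the session's tool subset from the full registry listing."""
    return [
        t for t in all_tools
        if t["source"] in enabled_sources and t["name"] not in denylist
    ]


def compact(messages: list[dict], keep_recent: int, threshold: int) -> list[dict]:
    """Once the transcript exceeds `threshold` entries, keep the system
    instructions plus the `keep_recent` most recent other messages."""
    if len(messages) <= threshold:
        return messages
    pinned = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    # Slice from an explicit start index so keep_recent=0 keeps nothing.
    return pinned + rest[max(0, len(rest) - keep_recent):]
```

For example, assemble_pool(tools, {"builtin"}, {"bash"}) yields every built-in tool except bash, and compact() leaves short transcripts untouched while trimming long ones down to the instructions plus the latest turns.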

Permissions are queryable objects that serve three contexts: interactive (human in the loop), coordinator (multi-agent), and swarm (autonomous). There are six agent types (explore, plan, verify, guide, general, status), each with its own prompt, tools, and constraints.
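Typed agents and context-aware permissions can be combined in one lookup. Only the agent type names and the three contexts come from the text above; the prompts, tool sets, can_mutate flag, and resolution policy (including denying mutations in swarm mode) are assumptions invented for this sketch, which defines two of the six types and assumes Python 3.9 or later.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    INTERACTIVE = "interactive"   # human in the loop
    COORDINATOR = "coordinator"   # multi-agent
    SWARM = "swarm"               # autonomous


@dataclass(frozen=True)
class AgentType:
    name: str
    prompt: str
    allowed_tools: frozenset[str]
    can_mutate: bool


# Two of the six types, with made-up prompts and tool sets.
AGENT_TYPES = {
    "explore": AgentType(
        "explore", "Map the codebase; change nothing.",
        frozenset({"read_file", "search"}), can_mutate=False,
    ),
    "general": AgentType(
        "general", "Complete the task end to end.",
        frozenset({"read_file", "search", "write_file"}), can_mutate=True,
    ),
}


def resolve(agent: AgentType, tool: str, mutates: bool, context: Context) -> str:
    """Answer 'may this agent use this tool here?' as a queryable decision."""
    if tool not in agent.allowed_tools:
        return "deny"
    if not mutates:
        return "allow"
    if not agent.can_mutate:
        return "deny"
    # Hypothetical policy: who approves a mutation depends on the context.
    return {
        Context.INTERACTIVE: "ask_user",
        Context.COORDINATOR: "ask_coordinator",
        Context.SWARM: "deny",
    }[context]
```

The same resolve() call serves all three contexts, so adding a new agent type means adding one table entry rather than another branch in the orchestration code.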

"Claude Code defines six built-in agent types... each of these agent types comes with its own prompt its own allowed tools its own behavioral constraints." – Jones, revealing typed specialization unseen before.

For general agents: prefer dynamic tool subsets to hard-coded lists, handle permissions per context, and type your agents so they can be reused.

The author has released two skills: a generic agent assessor that runs a gap analysis against the primitives, and a version tuned to Claude Code for cross-pollination.

Key Takeaways

Define tools in metadata registries (name/desc/source) before code; enable filtering/introspection.
Tier permissions (high/medium/low trust), and layer extra security on risky tools like shell execution (Claude Code's bash tool has 18 modules).
Persist full sessions (msgs/tokens/permissions/config) and separate workflow states for crash recovery.
Enforce token budgets with projections/halts; auto-compact transcripts post-threshold.
Stream typed events (tools/tokens/crashes); log system actions for audits.
Verify agent runs and harness changes via guardrail tests.
Assemble dynamic tool pools per session; type agents (explore/plan/etc.) for specialization.
Model permissions as queryable state objects across contexts (interactive/multi-agent/autonomous).
Harden ops primitives (build validation) to match AI dev velocity.
Mine leaks like this for primitives, not hype; the primitives are what sustain agents at $2.5B scale.