Decouple Agent Brain from Hands for Scale

Managed Agents defines stable interfaces for the session (event log), the harness (the Claude loop), and the sandbox (execution environment), so each implementation can evolve independently as models improve. Decoupling the three also cut p50 time-to-first-token (TTFT) by 60% and p95 by over 90%.

Virtualize Agent Components Like OS Abstractions

Treat the parts of an agent as virtualized hardware: the session (an append-only event log), the harness (the loop that calls Claude and routes tool calls), and the sandbox (the code execution environment). This mirrors operating system design, where read() works the same across storage from disk packs to SSDs: interfaces stay stable while implementations swap freely. Harnesses encode assumptions about the model, and those assumptions go stale. Context resets, for example, were added to counter "context anxiety" in Claude Sonnet 4.5; Opus 4.5 did not need them, so the fix became dead weight. Stable interfaces outlast such harness changes and can support future harnesses, such as Claude Code or task-specific ones, without rework.
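The idea of a stable interface with swappable implementations can be sketched in a few lines of Python. This is only an illustration, not the Managed Agents API: the class names, the snake_case method names, and the two toy sandboxes are assumptions made for the example. The harness step depends only on the `execute(name, input)` shape, so either sandbox can be dropped in.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol


class Sandbox(Protocol):
    """The 'hands': anything that runs a named tool and returns text."""

    def execute(self, name: str, input: str) -> str: ...


@dataclass
class Session:
    """Append-only event log: events are added, never edited or removed."""

    events: List[Dict[str, str]] = field(default_factory=list)

    def emit_event(self, event: Dict[str, str]) -> None:
        self.events.append(event)

    def get_events(self, start: int = 0, end: Optional[int] = None) -> List[Dict[str, str]]:
        return self.events[start:end]


class EchoSandbox:
    """Hypothetical implementation #1."""

    def execute(self, name: str, input: str) -> str:
        return f"{name}: {input}"


class ShoutingSandbox:
    """Hypothetical implementation #2 with the same interface."""

    def execute(self, name: str, input: str) -> str:
        return f"{name}: {input}".upper()


def run_step(session: Session, sandbox: Sandbox, name: str, input: str) -> str:
    """One harness step: record the call, run it, record the result."""
    session.emit_event({"type": "tool_call", "name": name, "input": input})
    output = sandbox.execute(name, input)
    session.emit_event({"type": "tool_result", "output": output})
    return output
```

Because `run_step` never inspects which sandbox it was given, replacing `EchoSandbox` with `ShoutingSandbox` changes the output but not the harness code, which is the property the read() analogy describes.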

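Provision a sandbox with provision({resources}) and run a tool in it with execute(name, input) → string. To recover from a harness crash, start a new harness with wake(sessionId), call getSession(id) to retrieve the event log, and resume from the last event. During the loop, the harness records each event durably with emitEvent(id, event). Query session history with getEvents(), which returns positional slices of the log, so the harness can rewind or selectively reread events that sit outside Claude's context window.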
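A minimal sketch of the crash-recovery flow, assuming an in-memory dictionary in place of durable session storage. The function names mirror the ones in the text (in snake_case), but their bodies, the `Harness` class, and the `crash_at` parameter are hypothetical and exist only to simulate a failure.

```python
from typing import Dict, List, Optional

# In-memory stand-in for durable session storage (an assumption of this sketch).
_STORE: Dict[str, List[dict]] = {}


def emit_event(session_id: str, event: dict) -> None:
    """Append one event to the session log."""
    _STORE.setdefault(session_id, []).append(event)


def get_session(session_id: str) -> List[dict]:
    """Return a copy of the full event log for a session."""
    return list(_STORE.get(session_id, []))


def get_events(session_id: str, start: int = 0, end: Optional[int] = None) -> List[dict]:
    """Return a positional slice of the log."""
    return _STORE.get(session_id, [])[start:end]


class Harness:
    """Stateless loop: everything it needs is derived from the session log."""

    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        # Resume point is simply the number of events already recorded.
        self.next_step = len(get_session(session_id))

    def run(self, steps: List[str], crash_at: Optional[int] = None) -> None:
        for i in range(self.next_step, len(steps)):
            if crash_at == i:
                raise RuntimeError("harness crashed")  # simulated failure
            emit_event(self.session_id, {"step": i, "action": steps[i]})


def wake(session_id: str) -> Harness:
    """Start a fresh harness for an existing session."""
    return Harness(session_id)
```

In use, a first harness might crash partway through, and a second one created with `wake` on the same session id picks up where the log ends, without repeating the steps already recorded.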

Make Everything Cattle, Not Pets

The initial design put the session, harness, and sandbox in a single container, which made each container a fragile "pet": a container failure lost the session, debugging meant untangling harness bugs from network drops, and VPC integration forced network peering or self-hosting. The fix is to decouple the brain (harness + Claude) from the hands (sandboxes and tools) and from the session. The harness calls a sandbox like any other tool, so a sandbox failure surfaces as a tool error that Claude can retry on a fresh container.
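The "failure as tool error" pattern might look like the sketch below. All names here (`SandboxError`, `FlakySandbox`, `call_tool`, `run_with_retry`) are hypothetical. In the real system it is Claude that decides whether to retry after seeing the error; the fixed loop in `run_with_retry` only stands in for that decision.

```python
import itertools
from typing import List, Tuple


class SandboxError(Exception):
    """Raised when a container is unreachable or dies mid-call."""


class FlakySandbox:
    """Hypothetical sandbox that fails every call when marked unhealthy."""

    _ids = itertools.count(1)

    def __init__(self, healthy: bool = True) -> None:
        self.id = next(self._ids)
        self.healthy = healthy

    def execute(self, name: str, input: str) -> str:
        if not self.healthy:
            raise SandboxError(f"container {self.id} unresponsive")
        return f"[{name}] ok: {input}"


def provision(resources: dict) -> FlakySandbox:
    """Stand-in for provisioning a fresh, healthy container."""
    return FlakySandbox(healthy=True)


def call_tool(sandbox: FlakySandbox, name: str, input: str) -> Tuple[str, bool]:
    """Return (text shown to the model, is_error). The harness never crashes here."""
    try:
        return sandbox.execute(name, input), False
    except SandboxError as exc:
        return f"tool error: {exc}", True


def run_with_retry(sandbox: FlakySandbox, name: str, input: str,
                   resources: dict, max_attempts: int = 2) -> List[str]:
    """Collect what the model would see across attempts."""
    transcript = []
    for _ in range(max_attempts):
        result, is_error = call_tool(sandbox, name, input)
        transcript.append(result)
        if not is_error:
            break
        # Replace the dead container instead of nursing it back to health.
        sandbox = provision(resources)
    return transcript
```

Starting from an unhealthy sandbox, the transcript holds one tool error followed by one successful result from the replacement container.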

No one has to nurse an unresponsive container, and because the session lives outside the harness, the harness becomes stateless cattle too. There is also a performance win: a brain starts inference as soon as it pulls the session, and it provisions hands only on demand. Sessions that never need a container skip its setup (repo clone, boot), so p50 time-to-first-token (TTFT) drops 60% and p95 drops over 90%. The design scales to many brains (stateless harnesses) and many hands (each exposed as an execute tool), letting Claude route work across VPCs, custom tools, and MCP servers without the harness assuming where they run.
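On-demand provisioning can be illustrated with a small lazy wrapper. This is a sketch under assumptions: `LazyHands`, `FakeSandbox`, and `run_turn` are made-up names, and a real provisioner would boot a container rather than construct an object.

```python
from typing import Callable, List, Optional, Tuple


class FakeSandbox:
    """Hypothetical sandbox; a real one would involve a boot and a repo clone."""

    def execute(self, name: str, input: str) -> str:
        return f"ran {name}({input})"


class LazyHands:
    """Defers the expensive provision call until the first tool call arrives."""

    def __init__(self, provision: Callable[[dict], FakeSandbox], resources: dict) -> None:
        self._provision = provision
        self._resources = resources
        self._sandbox: Optional[FakeSandbox] = None
        self.provision_count = 0

    def execute(self, name: str, input: str) -> str:
        if self._sandbox is None:
            # Setup cost is paid here, not at session start.
            self._sandbox = self._provision(self._resources)
            self.provision_count += 1
        return self._sandbox.execute(name, input)


def run_turn(hands: LazyHands, tool_calls: List[Tuple[str, str]]) -> List[str]:
    """Stand-in for one model turn; a text-only turn passes an empty list."""
    return [hands.execute(name, arg) for name, arg in tool_calls]
```

A turn with no tool calls leaves `provision_count` at zero, which corresponds to the sessions that skip container setup entirely; later turns reuse the sandbox created by the first call.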

Secure Credentials and Recoverable Context

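Security: keep credentials out of the sandbox, because that is where the untrusted code Claude generates runs. For Git, the repository is cloned with its access token during sandbox initialization and wired to the local remote, so the token itself is never exposed. Custom tools go through an MCP proxy: the proxy takes a token tied to the session, fetches the real credentials from a vault, and makes the call, so the harness never sees them. This limits what a prompt injection can reach, a risk that grows as models become more capable.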
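The proxy arrangement can be sketched as follows. Everything in the block is hypothetical (the `Vault` and `McpProxy` classes, the token strings, and the fake issue tracker), and it does not use a real MCP library. The point it illustrates is that the caller holds only a session token while the credential stays inside the proxy and vault.

```python
from typing import Callable, Dict, Tuple


class Vault:
    """Holds real credentials, keyed by (session token, service)."""

    def __init__(self) -> None:
        self._secrets: Dict[Tuple[str, str], str] = {}

    def store(self, session_token: str, service: str, credential: str) -> None:
        self._secrets[(session_token, service)] = credential

    def fetch(self, session_token: str, service: str) -> str:
        try:
            return self._secrets[(session_token, service)]
        except KeyError:
            raise PermissionError("no credential for this session and service") from None


class McpProxy:
    """Sits between the harness and external services."""

    def __init__(self, vault: Vault, services: Dict[str, Callable[[str, str], str]]) -> None:
        self._vault = vault
        self._services = services

    def call(self, session_token: str, service: str, payload: str) -> str:
        # The credential is looked up and used here; it is never returned.
        credential = self._vault.fetch(session_token, service)
        return self._services[service](credential, payload)


def fake_issue_tracker(credential: str, payload: str) -> str:
    """Hypothetical external service that checks the credential."""
    if credential != "oauth-secret-123":
        return "401 unauthorized"
    return f"created issue: {payload}"


vault = Vault()
vault.store("session-abc", "issues", "oauth-secret-123")
proxy = McpProxy(vault, {"issues": fake_issue_tracker})
```

The harness side would call `proxy.call("session-abc", "issues", "fix login bug")` and receive only the service's response. A call with an unknown session token fails with `PermissionError` before any service is contacted.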

Context for long-horizon tasks: a session log kept as a durable object outside the context window is safer than compaction, memory tools, or trimming, which are irreversible and risk discarding tokens that are needed later. The harness can transform the events it fetches (for example, organizing them to improve prompt cache hits) before they enter Claude's context window, which separates durable storage from context management that keeps evolving. Instead of holding context as an object in a REPL, the harness interrogates the session through getEvents(), so earlier turns stay available without loss.
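The split between durable storage and context management can be sketched like this. The `pinned` flag, the `recent` window size, and the "pinned events first" ordering are assumptions chosen for the example, not the policy Managed Agents uses; the only idea taken from the text is that the transform reads from the log and never mutates it.

```python
from typing import Dict, List, Optional

Event = Dict[str, object]


def get_events(log: List[Event], start: int = 0, end: Optional[int] = None) -> List[Event]:
    """Positional slice of the durable log."""
    return log[start:end]


def build_context(log: List[Event], recent: int = 3) -> List[Event]:
    """Choose what enters the model's window, leaving the log untouched.

    Pinned events go first so the prefix stays stable between calls
    (a stable prefix is what prompt caching benefits from), followed by
    the most recent unpinned events.
    """
    pinned = [e for e in log if e.get("pinned")]
    tail_start = max(0, len(log) - recent)
    tail = [e for e in get_events(log, tail_start) if not e.get("pinned")]
    return pinned + tail


def reread(log: List[Event], start: int, end: int) -> List[Event]:
    """Bring older events back on demand; nothing was discarded."""
    return get_events(log, start, end)
```

Unlike compaction or trimming, dropping an event from the result of `build_context` is reversible: `reread` can fetch the same positions from the log in a later turn, and a future harness can replace `build_context` with a different policy without touching stored sessions.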

Summarized by x-ai/grok-4.1-fast via openrouter

