Symphony: Orchestrator Layer Scales AI Agents Past Human Bottlenecks

Harness Engineering Replaces Prompting as AI's Core Work

AI models act like CPUs: they are strong at reasoning and generating output, but on their own they handle only narrow tasks. The harness, meaning the infrastructure around the model, manages memory, sub-agents, tool execution, chat history, and more, and does the bulk of the work. As agents scale, humans become the bottleneck, which shifts engineering effort from prompts to scaffolding.

Harnesses divide into inner and outer layers. The inner harness is built into tools like Claude Code, Cursor, or Codex and provides sub-agents, sandboxing, and tools. The outer harness is custom code that controls the session lifecycle: terminating sessions, clearing context, and injecting files from disk. Metaprompting frameworks like Superpowers, GSD, or BMAD improve first attempts but fall short on reliability.
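To make the outer-harness idea concrete, here is a minimal sketch of the three lifecycle controls named above (terminate, clear context, inject files from disk). Everything in it is an assumption for illustration: `Session`, `echo_agent`, and the method names are hypothetical, and the agent is a dummy function standing in for a real inner harness such as a coding tool's CLI.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, List

# Hypothetical stand-in for an inner harness. A real outer harness would
# launch a coding tool here instead of calling a plain function.
AgentFn = Callable[[List[str]], str]


@dataclass
class Session:
    agent: AgentFn
    context: List[str] = field(default_factory=list)
    alive: bool = True

    def inject_file(self, path: Path) -> None:
        """Load a file from disk into the session context."""
        self.context.append(f"# {path.name}\n{path.read_text()}")

    def clear_context(self) -> None:
        """Drop accumulated context so the next step starts fresh."""
        self.context.clear()

    def step(self, instruction: str) -> str:
        """Send one instruction to the agent along with the current context."""
        if not self.alive:
            raise RuntimeError("session terminated")
        self.context.append(instruction)
        return self.agent(self.context)

    def terminate(self) -> None:
        """End the session; further steps are rejected."""
        self.alive = False


def echo_agent(context: List[str]) -> str:
    # Dummy agent: only reports how many context items it received.
    return f"saw {len(context)} context items"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        notes = Path(tmp) / "notes.md"
        notes.write_text("Prefer small, reviewable changes.")

        session = Session(agent=echo_agent)
        session.inject_file(notes)
        print(session.step("Refactor module A"))  # saw 2 context items
        session.clear_context()
        print(session.step("Refactor module B"))  # saw 1 context items
        session.terminate()
```

The point of the sketch is that the lifecycle decisions live in ordinary code outside the model, so they behave the same way on every run.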

Use guides (feedforward: agent.md files, skills, playbooks, examples) to steer agents up front. Add sensors (feedback) to catch what the guides miss. Computational sensors (linters, type checks, schemas) are deterministic, run without AI, and feed failures back to the agent; builders underuse them. Inferential sensors use LLMs as judges, for example a different model reviewing the code. Together these form a cybernetic loop that regulates toward a desired state, as in external Ralph Wiggum loops that spawn new sessions until a goal is met (e.g., human approval). Examples: Gas Town for parallel loops; Archon for custom workflows with parallelism.
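The feedforward/feedback loop can be sketched in a few lines. This is an illustration under stated assumptions, not how any of the named tools work: the agent is a scripted stub that returns canned drafts, the guide is a plain string, and the two computational sensors (`syntax_sensor`, `docstring_sensor`) are hypothetical examples built on Python's standard `ast` module.

```python
import ast
from typing import Callable, List, Optional, Tuple

Sensor = Callable[[str], Optional[str]]
Agent = Callable[[str, List[str]], str]


def syntax_sensor(code: str) -> Optional[str]:
    """Computational sensor: return an error message, or None if the code parses."""
    try:
        ast.parse(code)
        return None
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}: {exc.msg}"


def docstring_sensor(code: str) -> Optional[str]:
    """Computational sensor: require a docstring on every function."""
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.FunctionDef) and ast.get_docstring(node) is None:
            return f"function '{node.name}' has no docstring"
    return None


def run_loop(agent: Agent, guide: str, sensors: List[Sensor],
             max_attempts: int = 5) -> Tuple[str, int]:
    """Re-run the agent, feeding sensor failures back, until all sensors pass."""
    feedback: List[str] = []
    for attempt in range(1, max_attempts + 1):
        code = agent(guide, feedback)
        feedback = []
        for sensor in sensors:
            message = sensor(code)
            if message is not None:
                feedback.append(message)
                break  # later sensors assume the earlier ones passed
        if not feedback:
            return code, attempt
    raise RuntimeError(f"no passing result after {max_attempts} attempts")


def make_scripted_agent(drafts: List[str]) -> Agent:
    """Stub agent that returns canned drafts in order (stands in for a model)."""
    remaining = iter(drafts)
    return lambda guide, feedback: next(remaining)


if __name__ == "__main__":
    drafts = [
        "def add(a, b)\n    return a + b\n",        # missing colon
        "def add(a, b):\n    return a + b\n",       # missing docstring
        'def add(a, b):\n    """Add two numbers."""\n    return a + b\n',
    ]
    agent = make_scripted_agent(drafts)
    code, attempts = run_loop(agent, "Write add(a, b).", [syntax_sensor, docstring_sensor])
    print(attempts)  # 3
```

An inferential sensor would fit the same `Sensor` signature, with the body calling a judge model instead of `ast`.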

Harnesses range from deterministic (fixed workflows, e.g., contract review with document checks) to probabilistic (open-ended research with citation validation and multi-LLM reviews). OpenAI reports a 500% increase in landed pull requests via such systems.

Symphony's Orchestrator Layer Enables Multi-Agent Scale

Build atop harnesses with an orchestrator/scheduler layer for multi-agent coordination. Symphony turns issue trackers like Linear into triggers: each open ticket spawns an isolated agent workspace (e.g., Codex in app server mode via CLI) that runs a state machine until the work is done. Humans interact at a high level of abstraction through tickets rather than supervising tabs, so less technical staff can participate.

This solves two problems with parallel agents: clashing (workspaces are isolated) and human-in-the-loop oversight (tickets allow review without micromanaging). The GitHub repo is mostly a spec.md: prompt your agent to implement it in any language against any coding agent (even Claude). The reference Elixir prototype uses the Linear API to pull tickets.
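The ticket-driven pattern can be illustrated with a small sketch. It is not taken from Symphony's spec or its Elixir prototype: the state names, the `Ticket` type, and the `dispatch` and `approve` functions are all hypothetical, tickets are in-memory objects rather than Linear API results, and the agent is a stub that writes one file.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Callable, List


class State(Enum):  # hypothetical states, not Symphony's actual ones
    OPEN = "open"
    IN_PROGRESS = "in_progress"
    NEEDS_REVIEW = "needs_review"
    DONE = "done"


@dataclass
class Ticket:
    id: str
    title: str
    state: State = State.OPEN


AgentFn = Callable[[Ticket, Path], None]


def run_ticket(ticket: Ticket, agent: AgentFn, root: Path) -> Ticket:
    """Give the ticket its own directory so parallel agents cannot clash."""
    workspace = root / ticket.id
    workspace.mkdir()
    ticket.state = State.IN_PROGRESS
    agent(ticket, workspace)
    ticket.state = State.NEEDS_REVIEW  # hand back to a human via the ticket
    return ticket


def dispatch(tickets: List[Ticket], agent: AgentFn, root: Path,
             max_workers: int = 4) -> List[Ticket]:
    """Run one agent per open ticket, in parallel."""
    open_tickets = [t for t in tickets if t.state is State.OPEN]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: run_ticket(t, agent, root), open_tickets))


def approve(ticket: Ticket) -> None:
    """Human sign-off: the only transition that reaches DONE."""
    if ticket.state is not State.NEEDS_REVIEW:
        raise ValueError(f"ticket {ticket.id} is not awaiting review")
    ticket.state = State.DONE


def stub_agent(ticket: Ticket, workspace: Path) -> None:
    # Stand-in for a coding agent: writes one file inside its own workspace.
    (workspace / "result.txt").write_text(f"work for: {ticket.title}")


if __name__ == "__main__":
    tickets = [Ticket("T-1", "Fix login bug"), Ticket("T-2", "Add export button")]
    with tempfile.TemporaryDirectory() as tmp:
        for ticket in dispatch(tickets, stub_agent, Path(tmp)):
            print(ticket.id, ticket.state.value)
        approve(tickets[0])
        print(tickets[0].state.value)  # done
```

Keeping approval as a separate transition is what lets a reviewer work from the ticket alone, without watching the agent's session.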

Apply Layers to Production AI Apps

The same layers extend to custom apps: the core agentic system acts as the inner harness, and the outer harness adds guides and sensors (e.g., automated document checks plus an LLM judge in contract review). The lines between layers can blur, so combine them deliberately to avoid chaos in parallel setups; Gas Town, for example, orchestrates multiple Ralph loops. Future AI engineering prioritizes scaffolding over prompting for reliable autonomy.
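As a rough illustration of the contract-review example, the sketch below runs cheap deterministic document checks first and calls a judge only when they pass. All specifics are assumptions: the required section names, the placeholder pattern, and `stub_judge`, which is a keyword match standing in for a real LLM call.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical policy: headings every contract must contain.
REQUIRED_SECTIONS = ["Term", "Termination", "Governing Law"]

Judge = Callable[[str], List[str]]


@dataclass
class Review:
    passed: bool
    findings: List[str]


def document_checks(contract: str) -> List[str]:
    """Deterministic sensor: no model involved, same answer every run."""
    findings = []
    for section in REQUIRED_SECTIONS:
        pattern = rf"^\s*{re.escape(section)}\b"
        if not re.search(pattern, contract, re.IGNORECASE | re.MULTILINE):
            findings.append(f"missing section: {section}")
    if re.search(r"\[(TBD|TODO)\]", contract):
        findings.append("unresolved placeholder")
    return findings


def stub_judge(contract: str) -> List[str]:
    # Stand-in for an inferential sensor. A real one would prompt a model.
    if "unlimited liability" in contract.lower():
        return ["judge: liability appears uncapped"]
    return []


def review(contract: str, judge: Judge) -> Review:
    """Run the cheap checks first; consult the judge only if they pass."""
    findings = document_checks(contract)
    if findings:
        return Review(False, findings)
    findings = judge(contract)
    return Review(not findings, findings)


if __name__ == "__main__":
    draft = "Term: 12 months\nTermination: 30 days notice\n"
    print(review(draft, stub_judge).findings)  # ['missing section: Governing Law']
```

Ordering the sensors this way keeps the slower, non-deterministic judge off drafts that a regular expression can already reject.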