The Orchestration Tax: Architecting Your Attention

The Human Bottleneck in Concurrent Systems

Modern AI tooling makes it trivial to spawn dozens of agents, which creates an illusion of massive productivity. The catch is a structural mismatch: agents run in parallel, but human judgment is a strictly serial process. You are the 'Global Interpreter Lock' (GIL) of your AI-powered workflow: only one piece of work can hold your attention at a time. Just as Amdahl’s Law says that overall speedup is limited by the serial portion of the work, your output is capped by your capacity to review, reconcile, and understand the code your agents produce.
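To make the Amdahl's Law comparison concrete, here is a minimal sketch of the standard formula, speedup = 1 / (s + (1 - s) / n), where s is the serial fraction and n is the number of parallel workers. The 30% review share below is a hypothetical figure chosen for illustration, not a measurement.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical assumption: human review accounts for 30% of the total work.
REVIEW_SHARE = 0.30

for agents in (1, 3, 10, 20, 100):
    print(f"{agents:>3} agents -> {amdahl_speedup(REVIEW_SHARE, agents):.2f}x")
```

Under that assumption the speedup can never reach 1 / 0.30, or roughly 3.33x, no matter how many agents run. Going from 10 agents to 100 adds far less than going from 1 to 3.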

The Cost of Cognitive Context Switching

Managing too many agents incurs an 'orchestration tax': you are perpetually busy but unproductive. Every switch between agent outputs carries a heavy context-switching cost. Because a human brain cannot reload context as cheaply as a CPU, the result is 'cognitive surrender,' where you stop critically evaluating agent code to save mental energy. Technical debt and cognitive debt then accumulate together: you gradually lose your mental model of the codebase, and production failures become difficult to debug.
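A back-of-the-envelope model shows how quickly switching overhead adds up. This is only a sketch: it assumes a fixed cost each time you interrupt your own work to look at agent output, and every number in it is hypothetical.

```python
def review_minutes(outputs: int, minutes_each: float,
                   switches: int, switch_cost: float) -> float:
    """Total time spent: the reviews themselves plus the context reloads."""
    return outputs * minutes_each + switches * switch_cost


# Hypothetical figures: 12 agent outputs, 10 minutes of review each,
# and 5 minutes to rebuild context after every interruption.
interrupt_per_output = review_minutes(12, 10, switches=12, switch_cost=5)
three_sittings = review_minutes(12, 10, switches=3, switch_cost=5)

print(f"one interruption per output: {interrupt_per_output:.0f} min")
print(f"three review sittings:       {three_sittings:.0f} min")
```

The review work is identical in both cases (120 minutes). Only the number of interruptions changes, and in this toy model that alone accounts for a 45-minute difference.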

Architecting Your Attention

To maximize throughput, you must treat your attention as a scarce resource and design your workflow like a distributed system:

Implement Backpressure: Scale your fleet of agents to match your actual review rate, not your UI's capacity. If you can only review three agents properly, do not run twenty.
Categorize Work: Separate tasks into 'isolated' (suitable for background agents) and 'complex' (requiring your full, serial attention). Do not attempt to parallelize complex architectural tasks; they require the 'lock' to be held exclusively.
Batch Your Reviews: Avoid checking agents sporadically. Let work accumulate and process it in batches to minimize the frequency of expensive context switches.
Automate Verification: Shift the burden of proof to the agents. Require them to generate passing tests or visual proof for the 'boring 80%' of work, so you only spend your limited attention on the 20% that requires human judgment.
Protect Serial Time: Recognize that orchestrating agents is overhead, not the work itself. Occasionally, the highest-leverage action is to stop orchestrating entirely and focus on a single problem with your full, undivided attention.
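The backpressure idea can be sketched with a semaphore that caps how many agents run at once. This is a minimal illustration, not a real orchestration API: `run_agent` is a hypothetical stand-in for whatever call launches an agent, and `review_capacity` is the number of outputs you believe you can review properly.

```python
import asyncio


async def run_agent(task: str) -> str:
    """Hypothetical stand-in for a real agent call."""
    await asyncio.sleep(0.01)  # simulate the agent working
    return f"diff for {task}"


async def run_with_backpressure(tasks: list[str],
                                review_capacity: int) -> tuple[list[str], int]:
    """Run every task, but never more than `review_capacity` agents at once.

    Returns the results in task order plus the peak concurrency observed.
    """
    slots = asyncio.Semaphore(review_capacity)
    in_flight = 0
    peak = 0

    async def guarded(task: str) -> str:
        nonlocal in_flight, peak
        async with slots:  # waits here while all slots are taken
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await run_agent(task)
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(t) for t in tasks))
    return list(results), peak


if __name__ == "__main__":
    tasks = [f"task-{i}" for i in range(20)]
    results, peak = asyncio.run(run_with_backpressure(tasks, review_capacity=3))
    print(len(results), "results, peak concurrency", peak)
```

All twenty tasks are still queued, but the limit comes from your review capacity rather than from how many agents the tool will let you start. A stricter variant would hold each slot until the human review of that output is finished, so unreviewed work cannot pile up either.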
