Building Complex Software with Long-Running AI Agents

The Shift from Chatbots to Long-Running Agents

Traditional AI coding assistants often fail on complex, multi-step engineering tasks because they rely on single-prompt interactions. When a step breaks, these models typically give up or hallucinate a fix. Long-running agents, by contrast, use "goal primitives" that let them persist through hours or days of work and self-correct when they hit errors in a chain of dependent tasks.
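To make the idea of persisting through a chain of dependent tasks concrete, here is a minimal sketch of a retry loop. It is not how any particular agent implements "goal primitives"; the names `Task`, `run_chain`, and `max_attempts` are hypothetical and exist only for illustration. Each task receives the results of earlier tasks, and a failure is recorded and retried instead of ending the whole run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Task:
    """One step in a dependent chain (hypothetical structure for illustration)."""
    name: str
    run: Callable[[Dict[str, object]], object]  # receives results of earlier tasks
    depends_on: List[str] = field(default_factory=list)


def run_chain(
    tasks: List[Task], max_attempts: int = 3
) -> Tuple[Dict[str, object], List[Tuple[str, int, str]]]:
    """Run tasks in order, retrying a failed task instead of abandoning the goal."""
    results: Dict[str, object] = {}
    error_log: List[Tuple[str, int, str]] = []
    for task in tasks:
        missing = [dep for dep in task.depends_on if dep not in results]
        if missing:
            raise RuntimeError(f"{task.name} is missing dependencies: {missing}")
        last_error = None
        for attempt in range(1, max_attempts + 1):
            try:
                results[task.name] = task.run(results)
                break
            except Exception as exc:
                # A real agent would feed this error back to the model so the
                # next attempt can change strategy; this sketch only records it.
                last_error = exc
                error_log.append((task.name, attempt, str(exc)))
        else:
            # The inner loop never hit `break`, so every attempt failed.
            raise RuntimeError(
                f"{task.name} failed after {max_attempts} attempts"
            ) from last_error
    return results, error_log
```

The error log is returned alongside the results so that a caller can inspect what went wrong along the way, even when the chain eventually succeeds.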

Engineering Complex Systems via Agentic Pipelines

To demonstrate the capability of long-running agents, Addy Osmani highlights two non-trivial use cases:

Operating System Development: An agent was tasked with building a functional OS (dubbed "Adios") featuring a window manager, IndexedDB file system, terminal, file explorer, and a paint application. The agent successfully integrated complex features, including a functional version of the game Doom and a music visualizer, by iterating over several hours.
3D Web Application Optimization: Building a nostalgic 3D video store scene required solving a series of hard technical constraints. Starting with a 156MB Blender file, the agent had to transform it into a browser-ready experience under 10MB. This required a multi-day pipeline to handle:
- Export Pipelines: Writing custom Python scripts to manage Draco quantization and mesh transforms without corrupting geometry.
- Asset Optimization: Automating texture resizing, image compression, and glTF file conversion to ensure fast, lazy-loaded performance.
- Visual Fidelity: Adjusting lighting intensities and material shaders to ensure the browser-rendered output remained faithful to the original 3D scene.
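The quantization step above trades coordinate precision for file size. The sketch below shows generic uniform quantization of a list of coordinates and measures the resulting error. It is not Draco's actual implementation and does not call the Draco or Blender APIs. The function names and the candidate bit depths are assumptions chosen for illustration.

```python
from typing import Iterable, List, Optional, Sequence


def quantize_roundtrip(values: Sequence[float], bits: int) -> List[float]:
    """Snap each value to one of 2**bits evenly spaced levels, then decode it."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a flat range loses nothing
    levels = (1 << bits) - 1
    step = (hi - lo) / levels
    return [lo + round((v - lo) / step) * step for v in values]


def max_error(values: Sequence[float], bits: int) -> float:
    """Largest absolute difference introduced by the round trip."""
    decoded = quantize_roundtrip(values, bits)
    return max(abs(a - b) for a, b in zip(values, decoded))


def smallest_bits_within(
    values: Sequence[float],
    tolerance: float,
    candidates: Iterable[int] = range(8, 17),
) -> Optional[int]:
    """Return the lowest candidate bit depth whose error stays within tolerance."""
    for bits in candidates:
        if max_error(values, bits) <= tolerance:
            return bits
    return None
```

A check like `max_error` is one way an agent could confirm that a chosen bit depth has not visibly distorted the geometry before moving on to the next stage.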

Key Takeaways for Builders

Persistence is Key: The primary advantage of these agents is their ability to work for days without human intervention, handling "dependent problems" that would stall a standard LLM.
Automation of Domain Expertise: Agents can bridge the gap for developers who lack specific domain knowledge (e.g., 3D modeling or Blender expertise) by generating the necessary scripts and pipelines to achieve a professional result.
Constraint-Driven Development: Given a clear spec, agents can manage the trade-offs between high-fidelity assets and performance requirements (like browser load times), provided they have the right tools to validate their own output.
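One simple form of self-validation is a size check that the agent can run after every build. The following sketch uses only the Python standard library. The 10MB figure comes from the video store example and is read here as 10 MiB, which is an assumption. The function names and the directory layout are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple

BUDGET_BYTES = 10 * 1024 * 1024  # "under 10MB", interpreted here as 10 MiB


def bundle_size(root: Path) -> int:
    """Total size in bytes of every file under root."""
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file())


def check_budget(root: Path, budget: int = BUDGET_BYTES) -> Tuple[bool, int]:
    """Return (passes, total_bytes); passing means strictly under the budget."""
    total = bundle_size(root)
    return total < budget, total


def largest_files(root: Path, count: int = 5) -> List[Tuple[str, int]]:
    """List the biggest files so the next optimization pass knows where to look."""
    sizes = [
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:count]
```

Returning the measured total along with the pass/fail flag gives the agent a number to improve on, and `largest_files` points it at the assets most worth compressing further.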
