Building a Deterministic Runtime for AI Agents

The Failure of Model-Driven Orchestration

Large language models are probabilistic, which makes them a poor fit for deterministic enterprise workflows such as payment authorization or vendor onboarding. When a model is asked to orchestrate a workflow (sequencing tool calls, managing state, and interpreting results), the workflow inherits non-deterministic behavior, security risks, and gaps in the audit trail. Larger models and longer context windows do not solve these problems and often make them worse: longer contexts are more exposed to the "lost in the middle" effect, in which information buried mid-prompt is used less reliably, and larger tool catalogs invite tool-space interference, in which adding more tools degrades performance.

Shifting the Execution Boundary

The core architectural fix is to move the execution loop out of the model and into a governed runtime. The author introduces "Lattice," an open framework that treats workflows as "capabilities," which are typed Python contracts. In this design the LLM acts only as an intent engine: it requests a high-level outcome (e.g., VendorOnboarding), and the runtime handles the deterministic logic of sequencing, retries, and error recovery.
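To make the boundary concrete, here is a minimal sketch of an intent dispatcher. It does not use Lattice's actual API, which this summary does not show; the registry, the `register` decorator, `handle_intent`, and the JSON intent shape are all assumptions made for illustration. The point is only that the model supplies a name and arguments, while the sequence of work lives in ordinary code.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: capability name -> deterministic workflow function.
CAPABILITIES: Dict[str, Callable[[dict], dict]] = {}


def register(name: str):
    """Illustrative decorator that exposes a workflow under a capability name."""
    def wrap(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        CAPABILITIES[name] = fn
        return fn
    return wrap


@register("VendorOnboarding")
def vendor_onboarding(args: dict) -> dict:
    # The order of operations is fixed here, in code, not chosen by the model.
    vendor_id = f"vendor-{args['tax_id'][-4:]}"
    return {"status": "onboarded", "vendor_id": vendor_id}


def handle_intent(raw_intent: str) -> dict:
    """Runtime entry point: the model's only output is a JSON intent."""
    intent = json.loads(raw_intent)
    name = intent.get("capability")
    if name not in CAPABILITIES:
        # Unknown requests are rejected instead of being improvised.
        return {"status": "rejected", "reason": f"unknown capability: {name}"}
    return CAPABILITIES[name](intent.get("args", {}))


result = handle_intent(
    '{"capability": "VendorOnboarding",'
    ' "args": {"legal_name": "Acme GmbH", "tax_id": "DE123456789"}}'
)
print(result)
```

A request for a capability that was never registered is refused by the runtime, so the model cannot assemble a new workflow out of individual tool calls.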

Capabilities as Governed Code

Capabilities are defined as Python files using decorators to manage the workflow lifecycle:

Contract Definition: @capability defines the inputs and the "projection" (the small, decision-relevant data returned to the model).
Step Execution: @step functions define discrete units of work, allowing for dependency management (e.g., running independent steps in parallel) and granular failure policies (@retry, @soft_failure, @hard_failure).
Security & Audit: Credentials and permissions are managed by the runtime, not the model. Because scopes are injected at the capability level, a security review shrinks from auditing dozens of individual endpoints to auditing a single capability. Every execution generates a structured, queryable audit trail, replacing opaque conversation logs.
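The decorator pattern described above can be sketched in plain Python. This is not Lattice's implementation: `capability`, `step`, their parameters (`scopes`, `retries`, `soft`), and `AuditTrail` are hypothetical stand-ins written for this example, and the failing steps are simulated. Parallel execution of independent steps is left out to keep the sketch short.

```python
import functools
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class AuditTrail:
    """Structured record of one execution (hypothetical shape)."""
    scopes: List[str] = field(default_factory=list)
    events: List[Dict[str, Any]] = field(default_factory=list)

    def record(self, step_name: str, outcome: str, attempt: int) -> None:
        self.events.append({"step": step_name, "outcome": outcome, "attempt": attempt})


def step(retries: int = 0, soft: bool = False):
    """Stand-in for @step combined with a retry and failure policy."""
    def decorate(fn):
        @functools.wraps(fn)
        def run(audit: AuditTrail, *args, **kwargs):
            for attempt in range(1, retries + 2):
                try:
                    value = fn(*args, **kwargs)
                    audit.record(fn.__name__, "ok", attempt)
                    return value
                except Exception:
                    audit.record(fn.__name__, "error", attempt)
            if soft:
                return None  # soft failure: the workflow continues without a result
            raise RuntimeError(f"step {fn.__name__} failed")  # hard failure: abort
        return run
    return decorate


def capability(scopes: List[str]):
    """Stand-in for @capability: the runtime, not the model, attaches scopes."""
    def decorate(fn):
        @functools.wraps(fn)
        def run(**inputs):
            audit = AuditTrail(scopes=list(scopes))
            projection = fn(audit, **inputs)
            return projection, audit
        return run
    return decorate


_calls = {"n": 0}


@step(retries=2)
def verify_tax_id(tax_id: str) -> bool:
    _calls["n"] += 1
    if _calls["n"] == 1:
        raise TimeoutError("simulated transient failure")
    return tax_id.startswith("DE")


@step(soft=True)
def fetch_credit_score(tax_id: str) -> int:
    raise ConnectionError("simulated outage of an optional service")


@capability(scopes=["vendors:write"])
def vendor_onboarding(audit: AuditTrail, tax_id: str) -> dict:
    verified = verify_tax_id(audit, tax_id)
    score = fetch_credit_score(audit, tax_id)
    return {"verified": verified, "credit_score_available": score is not None}


projection, audit = vendor_onboarding(tax_id="DE123456789")
print(projection)
print(audit.events)
```

Here the first step succeeds on its second attempt and the optional step fails softly, so the workflow still completes. Both facts end up in `audit.events` as structured records that can be queried later, which a conversation log cannot offer.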

Designing for Decision Surfaces

A capability's value is determined by its "projection." Instead of passing raw API responses (which are often sensitive and noisy) back to the model, the runtime filters the data into a small, typed object. The projection must let the model do three things: explain the outcome, present clear alternatives when something fails, and map each alternative to an actionable next step. This approach keeps sensitive data out of the model's context, which mitigates risks such as prompt injection and log exposure and significantly reduces token costs.
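A projection of this kind might look like the following sketch. The field names, the `project` function, the follow-up capability names, and the raw response are invented for illustration and do not come from Lattice or any real API. The sketch shows one way a typed object can carry an outcome, a reason, and actionable alternatives while leaving raw fields behind.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NextAction:
    label: str        # what the model can offer the user
    capability: str   # hypothetical capability that would carry it out


@dataclass(frozen=True)
class OnboardingProjection:
    status: str
    reason: Optional[str]
    alternatives: List[NextAction]


def project(raw: dict) -> OnboardingProjection:
    """Reduce a raw response to the decision-relevant fields only."""
    if raw["bank_check"]["passed"]:
        return OnboardingProjection("approved", None, [])
    return OnboardingProjection(
        status="blocked",
        reason="bank account could not be verified",
        alternatives=[
            NextAction("Ask the vendor to resubmit bank details", "RequestBankDetails"),
            NextAction("Escalate to manual review", "ManualReview"),
        ],
    )


# Simulated raw response containing data the model should never see.
raw_response = {
    "bank_check": {
        "passed": False,
        "iban": "DE00 0000 0000 0000 0000 00",
        "api_key_used": "sk-placeholder",
    },
    "internal_risk_notes": "free-text notes kept out of the model context",
}

projection = project(raw_response)
payload = asdict(projection)  # what would be serialized back to the model
print(payload)
```

Only `status`, `reason`, and `alternatives` cross the boundary. The account number, the credential, and the free-text notes stay inside the runtime, and free text from an upstream system is the kind of field through which injected instructions could otherwise reach the model.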
