Code Mode: LLMs Generate Executable Code for Agents

Replace JSON tool-calling with LLM-generated JavaScript that runs in a capability-based sandbox. The approach covers 2600+ API endpoints in about 1000 tokens of prompt (a 99.9% reduction), handles state, loops, and parallelism in code, and enables generative UIs and workflows.

Replace Tool Calls with Code for Massive APIs and Efficiency

Traditional tool-calling fails at scale: exposing 2600 Cloudflare API endpoints as individual tools consumes 1.2-1.5 million tokens and requires 8+ round trips for tasks like blocking DDoS IPs. Instead, expose two code-accepting tools, search (which runs over the OpenAPI JSON spec) and execute, shrinking the prompt to about 1000 tokens, a 99.9% reduction. The LLM generates JavaScript once and uses loops, state, sequencing, and parallelism natively. For a panicked user prompt like "find offending IPs attacking us and block them," it searches the spec for relevant endpoints, generates pagination-aware code (e.g., listing Workers via a single API call), and executes it close to the API surface in one shot, bypassing cumbersome dashboards.
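The two-tool shape described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Cloudflare's actual interface: the trimmed spec, the endpoint paths, the mock API with its topIps and blockIp methods, and the exact signatures of search and execute are all hypothetical.

```javascript
// Hypothetical, heavily trimmed stand-in for an OpenAPI spec (paths are invented).
const spec = {
  paths: {
    "/zones/{zone_id}/firewall/rules": { post: { summary: "Create a firewall rule to block traffic" } },
    "/zones/{zone_id}/analytics/top-ips": { get: { summary: "List top client IPs by request count" } },
    "/accounts/{account_id}/workers/scripts": { get: { summary: "List Workers scripts" } },
  },
};

// Tool 1: run model-written code against the spec and return only the matches,
// so the full spec never has to enter the prompt.
function search(code) {
  return new Function("spec", code)(spec);
}

// Tool 2: run model-written async code against an explicit `api` object.
// NOTE: the Function constructor is NOT a security boundary; a real harness
// would run this inside an isolate or similar sandbox.
async function execute(code, api) {
  const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;
  return new AsyncFunction("api", code)(api);
}

// Mock paginated API standing in for real HTTP calls (assumption for the demo).
function makeMockApi() {
  const pages = [
    [{ ip: "203.0.113.7", requests: 90000 }, { ip: "198.51.100.2", requests: 12 }],
    [{ ip: "203.0.113.9", requests: 75000 }],
  ];
  const blocked = [];
  return {
    blocked,
    async topIps(page) {
      return { items: pages[page] ?? [], hasMore: page + 1 < pages.length };
    },
    async blockIp(ip) {
      blocked.push(ip);
      return { ok: true };
    },
  };
}

// Code the model might write for `search`: filter endpoints by summary text.
const hits = search(`
  return Object.entries(spec.paths)
    .filter(([, ops]) => Object.values(ops).some((op) => /block|client ips/i.test(op.summary)))
    .map(([path]) => path);
`);

// Code the model might write for `execute`: one script with a pagination loop,
// local state, and parallel block calls, instead of many tool round trips.
const generated = `
  const offenders = [];
  for (let page = 0, more = true; more; page++) {
    const res = await api.topIps(page);
    offenders.push(...res.items.filter((x) => x.requests > 10000).map((x) => x.ip));
    more = res.hasMore;
  }
  await Promise.all(offenders.map((ip) => api.blockIp(ip)));
  return offenders;
`;

execute(generated, makeMockApi()).then((ips) => console.log("blocked:", ips));
```

The point of the sketch is the shape: the prompt only needs to describe two tools, while the loop, the filtering, and the fan-out all live in the generated script.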

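Because the output is code, it can be type-checked and syntax-checked before it runs, and it composes where chains of JSON tool calls tend to break. Models are trained on terabytes of code and are strong at writing it, and a single generated script removes the latency of repeated back-and-forth.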
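One way to read the syntax-checking point: a harness can reject malformed code before any of it executes. The sketch below assumes plain JavaScript's Function constructor as the parser; checkSyntax is a hypothetical helper name. Type checking would need a separate tool (for example a TypeScript compiler pass) and is not shown.

```javascript
// Returns { ok: true } if `code` parses as a function body, otherwise the
// parser's message. Constructing the function parses the code without running it.
function checkSyntax(code) {
  try {
    new Function(code);
    return { ok: true };
  } catch (err) {
    if (err instanceof SyntaxError) return { ok: false, error: err.message };
    throw err; // anything else (e.g. eval blocked by policy) is not a syntax problem
  }
}

const good = checkSyntax("const ids = [1, 2, 3]; return ids.map((n) => n * 2);");
const bad = checkSyntax("const ids = [1, 2, 3; return ids;"); // missing closing bracket

console.log(good.ok, bad.ok);
```

A failed check can be returned to the model as an error message so it can regenerate, without any side effects having occurred.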

Build Harnesses with Capability-Based Security

Create a 'harness': a sandbox (V8 isolates, WASM, or a custom JS interpreter) that starts with zero capabilities (no fetch, no APIs) and is granted explicit APIs one by one, which keeps execution observable and fast. Control all outgoing network traffic; ideally allow no raw fetches, only internal APIs. Use fast-starting isolates, which have roughly 10 years of security hardening behind them, for ephemeral runs.
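The capability-passing idea can be illustrated in a few lines. This is a sketch of the shape only and an assumption throughout: runWithCapabilities, the SHADOWED list, and the readFile capability are hypothetical names, and shadowing globals in plain JavaScript is not real isolation. An actual harness would rely on an isolate, WASM, or an interpreter for the security boundary.

```javascript
// Ambient globals shadowed so generated code cannot reach them by name.
// ASSUMPTION: illustrative list only; this is not a complete or secure denylist.
const SHADOWED = ["fetch", "process", "require", "globalThis", "XMLHttpRequest", "WebSocket"];

function runWithCapabilities(code, capabilities) {
  const log = []; // every capability call is recorded for observability
  const granted = {};
  for (const [name, fn] of Object.entries(capabilities)) {
    granted[name] = (...args) => {
      log.push({ name, args });
      return fn(...args);
    };
  }
  // The code sees exactly one useful parameter, `cap`; the shadowed names are
  // declared as parameters too and simply receive undefined.
  const fn = new Function("cap", ...SHADOWED, `"use strict";\n${code}`);
  const result = fn(Object.freeze(granted));
  return { result, log };
}

// Usage: grant a single read-only capability over an in-memory store.
const store = { "a.txt": "hello" };
const { result, log } = runWithCapabilities(
  `const text = cap.readFile("a.txt");
   return { text, fetchType: typeof fetch };`,
  { readFile: (name) => store[name] }
);

console.log(result, log);
```

Because every power arrives through the cap object, the harness has one place to log, rate-limit, or deny calls, which is where the observability mentioned above comes from.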

Programmers already script file categorization and renaming from their IDE; LLMs democratize this for non-technical users by generating and running such code safely (e.g., "rename files by date/location"). In Kenton Jackson's canvas demo (TLDraw/Excalidraw-style), there is no tic-tac-toe code anywhere: the LLM inspects the stroke array in the canvas state, recognizes a board, and draws an O in response, so the game emerges from the model inhabiting the state machine.
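For the "rename files by date/location" request, the generated code might look something like the sketch below. The metadata fields (takenAt, location), the sample photos, and the planRenames helper are assumptions for illustration; a real harness would expose file metadata through whatever capability it grants.

```javascript
// Hypothetical file metadata, as a harness might expose it to generated code.
const photos = [
  { name: "IMG_0042.jpg", takenAt: "2024-07-14T09:30:00Z", location: "Lisbon" },
  { name: "IMG_0043.jpg", takenAt: "2024-07-14T10:05:00Z", location: "Lisbon" },
  { name: "IMG_0101.jpg", takenAt: "2024-08-02T18:45:00Z", location: "Porto" },
];

// Plan renames as "<date>_<location>_<n><ext>" without touching any files,
// so the plan can be reviewed before a rename capability applies it.
function planRenames(items) {
  const counters = new Map();
  return items.map((f) => {
    const date = f.takenAt.slice(0, 10); // YYYY-MM-DD prefix of the ISO timestamp
    const dot = f.name.lastIndexOf(".");
    const ext = dot >= 0 ? f.name.slice(dot) : "";
    const key = `${date}_${f.location}`;
    const n = (counters.get(key) ?? 0) + 1; // running index per date/location pair
    counters.set(key, n);
    return { from: f.name, to: `${key}_${n}${ext}` };
  });
}

const plan = planRenames(photos);
console.log(plan);
```

Separating the plan from its application is a design choice that suits a sandbox: the generated code computes, and only a narrowly scoped capability performs the side effect.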

Unlock Stateful Workflows and Generative UIs

Extend the harness to long-running stateful workflows, lasting days or months per instance. Generate per-user UIs from context: an e-commerce site surfaces custom actions for "return shoes, find similar <$100" or "track delayed order" instead of a bland generic interface. The blank chat box becomes a dynamic canvas, and harnesses can run client-side (e.g., on an iPhone) to mash services together task by task.
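A long-running workflow instance needs state that survives between runs. One simple way to sketch that, assuming a JSON snapshot stored wherever the platform persists instance state, is shown below. The step names, the state fields, and the advance helper are hypothetical.

```javascript
// Each step takes the current state and returns the updated state.
// Step names and state fields are invented for illustration.
const steps = [
  { name: "requestReturn", run: (s) => ({ ...s, returnLabel: `LABEL-${s.orderId}` }) },
  { name: "awaitParcel", run: (s) => ({ ...s, parcelReceived: true }) },
  { name: "refund", run: (s) => ({ ...s, refunded: true }) },
];

// Run at most `maxSteps` steps, then return a JSON snapshot that can be stored
// and resumed later, possibly in a fresh sandbox.
function advance(snapshot, maxSteps = 1) {
  let { cursor, state } = JSON.parse(snapshot);
  for (let i = 0; i < maxSteps && cursor < steps.length; i++) {
    state = steps[cursor].run(state);
    cursor++;
  }
  return JSON.stringify({ cursor, state, done: cursor >= steps.length });
}

let snap = JSON.stringify({ cursor: 0, state: { orderId: "A17" } });
snap = advance(snap); // runs "requestReturn"; the instance can now go idle
const afterFirst = JSON.parse(snap);
snap = advance(snap, 2); // later: resumes and runs the remaining two steps
const finished = JSON.parse(snap);

console.log(afterFirst, finished);
```

Because the snapshot is plain data, each resume can happen in an ephemeral sandbox with no memory of the previous run.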

This shifts the focus from generating apps to inhabiting state, and invites a rethink of 30-year-old UI paradigms now that eval() can be made safe. UI developers are well placed to build these experiences.

Design Agent DX with Code-Native Interfaces

Your next users are code-generating agents in registries, not pubs. Optimize docs (Markdown, searchable), errors (which should guide the next step), and discoverability. Embed capability-based security across languages (JS, Python, WASM, Lisp). Humans get buttons; agents get the power of code: expose systems via typed APIs and let code interact with them.
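Errors that guide the next step could take a shape like the one sketched here. Everything in it is an assumption for illustration: the AgentError class, its code, hint, and docs fields, the listOrders function, and the docs path are hypothetical, not part of any existing API.

```javascript
// Hypothetical error shape: a machine-readable code, a plain explanation,
// and a concrete next step the calling agent can act on.
class AgentError extends Error {
  constructor(code, message, { hint, docs } = {}) {
    super(message);
    this.name = "AgentError";
    this.code = code;
    this.hint = hint;
    this.docs = docs;
  }
  // Serialize the useful fields so the agent sees them in tool output.
  toJSON() {
    return { code: this.code, message: this.message, hint: this.hint, docs: this.docs };
  }
}

// Example API function that validates input and explains how to recover.
function listOrders({ limit } = {}) {
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new AgentError(
      "INVALID_LIMIT",
      `limit must be an integer from 1 to 100, got ${JSON.stringify(limit)}`,
      {
        hint: "Retry with a limit between 1 and 100 and paginate with the returned cursor.",
        docs: "docs/orders.md#list", // hypothetical path to searchable Markdown docs
      }
    );
  }
  return { items: [], cursor: null };
}

let caught = null;
try {
  listOrders({ limit: 500 });
} catch (err) {
  caught = err;
}
console.log(JSON.stringify(caught));
```

An agent that receives the hint can correct its generated code on the next attempt instead of guessing, which is the same discoverability goal the docs advice serves.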

Summarized by x-ai/grok-4.1-fast via openrouter

