Agent Runtime: Tool-Loop as the Core Harness

The foundation of effective agents in 2026 is a lightweight runtime that manages the tool loop, context persistence, and execution harness. Nico Albanese teaches building this with Vercel's AI SDK v6's toolLoopAgent, a two-line abstraction over primitives like generateText and streamText. Define the agent once in lib/agent.ts for reuse across Next.js routes, Bun servers, or monorepos:

import { toolLoopAgent } from 'ai'; // part of AI SDK v6's core 'ai' package

export const agent = toolLoopAgent({
  model: 'gpt-4o-mini',
  instructions: 'Your system prompt here',
  tools: { /* tools later */ }
});

This keeps LLM logic centralized, avoiding 2,000-line route handlers. Call it in app/api/chat/route.ts with createAgentUIStreamResponse(agent, { messages }) for streaming. On the frontend, useChat from @ai-sdk/react handles message state, errors, and UI rendering. Prerequisites: Next.js app, Vercel CLI linked to a project for OIDC tokens authenticating AI Gateway and sandboxes. Install deps: pnpm add ai @ai-sdk/react zod. Run pnpm dev to test basic chat at localhost:3000.

Key principle: Instructions shape behavior early. Update instructions to "Respond like a cowboy", refresh, and the agent replies "Howdy partner 🐴". This later evolves into complex directives for planning, tool use, and persistence.

Common mistake: Inline tools/prompts in API routes, leading to bloat. Solution: Agent definition as single source of truth.

Context Augmentation: Provider-Executed Tools

Agents fail without external context. Start with OpenAI's web search, a provider-executed tool that needs no custom execute function. Install @ai-sdk/openai, import { openai }, and add it to the agent:

tools: {
  webSearch: openai.tools.webSearch({ /* optional params */ })
}

OpenAI executes the search server-side and injects results into the messages. Query "When is AI Engineer Summit London?" and the agent pauses, searches, and responds with dates. Customize: pass { location: 'London' } for localized results. Trade-off: provider lock-in (OpenAI-specific), but zero code for quick wins.

Three tool types explained:

  1. Custom tools: You define the description, Zod input schema, and execute function (e.g., the bash tool later).
  2. Provider-defined: Tools the model was trained to call, like Anthropic's bash/computer-use tools; you still supply the execution.
  3. Provider-executed: Run entirely on the provider's infrastructure, like web search; no execute function at all.

UI feedback is crucial—users see nothing during loops. Use typed messages for rendering.

End-to-End Type Safety from Agent Definition

AI SDK v6 infers types across the stack from the agent's tools. Export type AgentUIMessage = InferAgentUIMessage<typeof agent>;. In the route handler:

const response = await createAgentUIStreamResponse(agent, {
  messages: messages as AgentUIMessage[]
});

In page.tsx, const { messages } = useChat<AgentUIMessage>();. Now part.type === 'tool-web-search' autocompletes input.query: string and output.results. Build the UI:

{part.type === 'tool-web-search' && (
  <div>
    🔍 Searching: {part.input.query}... {part.status === 'pending' ? '⏳' : '✅'}
    {part.output?.results?.slice(0, 3).map((r, i) => (
      <p key={i}>{r.title}: {r.content}</p>
    ))}
  </div>
)}

Before: unknown types and manual casting. After: full autocomplete for inputs, outputs, and status. Quality criterion: the agent's tools dictate the UI; add a tool and the types propagate. This step fits mid-workflow, after the basic agent and before persistence.
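
The propagation can be sketched with a plain discriminated union. The part shapes below are illustrative stand-ins for what InferAgentUIMessage derives, not the SDK's exact types:

```typescript
// Illustrative only: hand-rolled stand-ins for the inferred tool-part types.
type WebSearchPart = {
  type: 'tool-web-search';
  input: { query: string };
  output?: { results: { title: string; content: string }[] };
};
type BashPart = {
  type: 'tool-bash';
  input: { command: string };
  output?: { stdout: string; stderr: string };
};
type ToolPart = WebSearchPart | BashPart;

// Narrowing on `type` gives full autocomplete on input/output per tool.
function label(part: ToolPart): string {
  switch (part.type) {
    case 'tool-web-search':
      return `🔍 Searching: ${part.input.query}`;
    case 'tool-bash':
      return `$ ${part.input.command}`;
  }
}
```

Adding a third tool to the agent extends the union, and every render site that switches on part.type picks up the new case automatically.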

Persistent Sandboxes: The Computer That Changes Agent Behavior

Key insight: File systems transform agents from hallucinating short-task bots into persistent task-followers. Vercel's internal DZero agent (a Slackbot with access to Vercel admin and Salesforce) became far more reliable once it had a file system: it created plan.md with the objective and steps at the top, checked off progress as it went, and stored research in directories. No more context dilution across long windows.

Vercel Sandboxes: Named, persistent file systems per agent run. Init via CLI: vercel sandbox init my-sandbox --persistent. Mount in agent calls with custom options:

const sandbox = await vercel.sandbox({ name: 'agent-computer' });
agent.call({ ..., sandbox });

Agents read/write files (e.g., memories.md), execute bash. Behavior shift: Builds on prior work across sessions, no manual memory. Trade-off: Sandbox isolation limits (no network by default), but secures execution.

Instructions enforce: "For every session: 1. Read plan.md/objective. 2. Update scratchpad. 3. Generate/store Python scripts for repeats. 4. Use bash for execution."
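
A tiny helper can scaffold that plan file. The layout (objective at the top, checkbox steps below) follows the session protocol above; the function name is my own:

```typescript
// Sketch: builds the initial plan.md content the instructions ask the agent to maintain.
function scaffoldPlan(objective: string, steps: string[]): string {
  const lines = [
    '# Objective',
    objective,
    '',
    '## Steps',
    ...steps.map((s) => `- [ ] ${s}`),
  ];
  return lines.join('\n');
}
```

The agent writes this once via bash on its first session, then flips `- [ ]` to `- [x]` as it progresses, which is what keeps it on track across long windows.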

Common mistake: Ephemeral context—agents forget mid-task. Avoid: Mandate file-based planning/checklists.

Custom Tools: Bash Execution and Learning via Scripts

Add bash tool for sandbox compute. Custom tool schema:

import { tool } from 'ai';
import { z } from 'zod';

// The tool's name comes from its key in the agent's tools map, so no `id` field is needed.
const bashTool = tool({
  description: 'Execute bash commands in the sandbox',
  inputSchema: z.object({ command: z.string() }), // v5+ renamed `parameters` to `inputSchema`
  execute: async ({ command }) => {
    // `sandbox` is the handle created earlier (const sandbox = await vercel.sandbox(...))
    const result = await sandbox.execute(command);
    return { stdout: result.stdout, stderr: result.stderr };
  },
});

Integrate: tools: { bash: bashTool, webSearch }. Agent now runs ls, echo 'test' > file.txt. Add memories.md: Agent appends insights, reads on start.
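
The memories.md flow reduces to a pure append; the entry format here is an assumption, not a prescribed layout:

```typescript
// Sketch: append one dated insight to the memories file content.
function appendMemory(existing: string, insight: string, date: string): string {
  const entry = `- ${date}: ${insight}`;
  return existing.trim().length === 0 ? entry : `${existing.trimEnd()}\n${entry}`;
}
```

In practice the agent performs this with bash (echo '...' >> memories.md); the helper just shows the invariant: one line per insight, oldest first, re-read at session start.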

Advanced: Self-improvement loop. Instruct: "For repeatable tasks, generate Python script, save as tools/script.py, execute via bash python script.py". Agent accumulates tools/context autonomously.
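
Under the hood that instruction reduces to a write-then-run pair of bash commands. This helper just formats them; the tools/ directory and naming are illustrative:

```typescript
// Sketch: commands the agent would issue through the bash tool to persist, then run, a script.
function scriptCommands(name: string, pythonSource: string): string[] {
  return [
    'mkdir -p tools',
    `cat > tools/${name}.py << 'EOF'\n${pythonSource}\nEOF`,
    `python tools/${name}.py`,
  ];
}
```

Because the script lands in the persistent sandbox, the next session can skip generation and run it directly, which is the "learning" half of the loop.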

Sub-agents: Delegate via nested toolLoopAgent. Full system: Web search → plan → bash/script gen → persist.
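
Sub-agent delegation just wraps one agent's run behind another agent's tool. A minimal stand-in with stubbed types (no SDK imports) to show the shape; a real implementation would use toolLoopAgent and tool():

```typescript
// Stand-in for an agent: anything that can turn a prompt into a result.
type MiniAgent = { run(prompt: string): Promise<string> };

// Wrap a sub-agent so the parent agent can invoke it like any other tool.
function asTool(sub: MiniAgent) {
  return {
    description: 'Delegate a focused sub-task to a specialist agent',
    execute: async ({ task }: { task: string }) => ({ result: await sub.run(task) }),
  };
}
```

The parent only sees a tool call and its result; the sub-agent's own loop, tools, and context stay isolated, which keeps the parent's window clean.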

Quality check: Output includes artifacts (files, scripts). Practice: Clone repo (AIE-London-demo), iterate instructions.

Key Takeaways

  • Define agents with toolLoopAgent for reusable, lightweight runtimes—centralize tools/instructions.
  • Use provider-executed tools like OpenAI webSearch for instant context without code.
  • Leverage end-to-end types: InferAgentUIMessage ensures UI/tools stay in sync.
  • Give agents persistent sandboxes: Changes behavior from flaky to task-persistent via file plans.
  • Combine bash + memories.md + script gen: Enables cross-session learning, no manual state.
  • Always instruct file-based planning: "Create/update plan.md with objective + steps" prevents drift.
  • Trade-off honesty: Provider tools lock you in; sandboxes secure execution but limit network access.
  • Test iteratively: Start cowboy prompt, layer tools/sandboxes.
  • Production: Link Vercel project, pull OIDC env vars for auth.

Notable Quotes:

  1. "Giving an agent a file system didn't just add storage, it changed how the agent behaved. It started following through on long tasks, staying on track, and building on its own prior work." (Intro insight on DZero agent.)
  2. "The instructions there were create this plan file and in that plan file was the objective right at the top and then right below the instructions were follow this plan file to a T, check things off as you go." (Explaining persistence shift.)
  3. "This is quite obvious as to what it does it is a tool loop using agent um kind of does what it says on the tin." (On toolLoopAgent naming/simplicity.)
  4. "The big assumption goes through every single AI SDK API decision is like we want to have the agent definition being the source of truth that everything kind of inherits from." (Type safety philosophy.)
  5. "Agents in 2026: agent runtime, tools, and a computer/sandbox for persistence." (Core building blocks.)