AutoAgent Optimizes Harnesses Like Karpathy's Auto-Research
Extends Karpathy's auto-research loop (edit code, run 5-minute evals, keep improvements) to agent harnesses (prompts and tools) via meta-agents, yielding domain-specific agents overnight on benchmarks like SpreadsheetBench.
Core Self-Improvement Loop: Edit, Eval, Iterate Overnight
Karpathy's auto-research uses a deliberately simple setup: one GPU and 5-minute training runs. Data prep and the tokenizer (prep.py) stay fixed; an agent edits the training code (train.py), covering the model, training loop, and hyperparameters, then evaluates per the human instructions in program.md. If metrics improve, the changes are committed; otherwise they are reverted. Humans "program in natural language" via program.md while the agent handles the code. Run overnight, the loop produces real gains without manual coding.
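The keep/revert loop can be sketched generically. Here `evaluate`, `apply_edit`, and `revert_edit` are hypothetical stand-ins (not Karpathy's actual code) for the 5-minute train+eval run, the agent's edit to train.py, and a git revert:

```python
def auto_research(evaluate, apply_edit, revert_edit, iterations=50):
    """Edit-eval-iterate loop: keep an edit only if the metric improves.

    evaluate()    -> float metric (stand-in for a ~5-minute train+eval run)
    apply_edit()  -> agent mutates the code (e.g., edits train.py)
    revert_edit() -> undo the last edit (e.g., `git checkout -- train.py`)
    """
    best = evaluate()  # baseline before any edits
    for _ in range(iterations):
        apply_edit()
        score = evaluate()
        if score > best:
            best = score     # keep: in practice, `git commit`
        else:
            revert_edit()    # revert: discard the regressing edit
    return best
```

Run overnight with many iterations, the only human input is program.md; the loop itself never needs to understand why an edit helped, only that the metric moved.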
AutoAgent applies the same loop to agent harnesses instead of ML training: a meta-agent edits the task agent's prompts, tools, and orchestration (agent.py), runs evals on benchmarks via adapters, and commits improvements based on results and reasoning traces. It starts with a minimal bash tool and discovers domain-specific logic autonomously.
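One meta-agent iteration over a harness might look like the following sketch. The `Harness` fields and the `propose`/`evaluate` callables are illustrative assumptions, not AutoAgent's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Harness:
    """Editable task-agent harness: what the meta-agent mutates.
    Field names are illustrative, not AutoAgent's real structure."""
    system_prompt: str
    tools: list = field(default_factory=lambda: ["bash"])  # minimal start

def meta_step(harness, propose, evaluate):
    """One meta-agent iteration: propose an edited harness, score it on
    the benchmark, and keep it only if the score beats the baseline.
    Returns (harness, kept)."""
    baseline = evaluate(harness)
    candidate = propose(harness)          # meta-agent's edit to agent.py
    if evaluate(candidate) > baseline:    # benchmark adapter verifies
        return candidate, True
    return harness, False
```

Iterating `meta_step` is how a harness that begins with only a bash tool can end up with domain-specific tools nobody wrote by hand.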
Architecture Enables Parallel, Domain-Agnostic Optimization
The system splits into a meta-agent (orchestrating iterations, spinning up thousands of parallel sandboxes) and a task agent (executing domain tasks). Adapters connect it to any benchmark (e.g., SpreadsheetBench, TerminalBench) for verification. The files mirror auto-research: program.md carries human guidance on goals and things to avoid, and agent.py is the editable target.
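Fanning a benchmark's tasks out across parallel sandboxes and aggregating a pass rate can be sketched as below; `run_task` is an assumed callable that executes the task agent on one task in an isolated sandbox:

```python
from concurrent.futures import ThreadPoolExecutor

def eval_in_sandboxes(run_task, tasks, max_workers=64):
    """Score one harness candidate across a benchmark's tasks in parallel.

    run_task(task) -> bool is assumed to spin up a sandbox, run the task
    agent inside it, and report whether the benchmark verifier passed.
    Returns the fraction of tasks passed.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_task, tasks))
    return sum(results) / len(results)
```

In a real deployment each worker would talk to a remote sandbox rather than a local thread, but the aggregation logic the meta-agent consumes is the same.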
The simplicity mirrors Karpathy's setup: no complex infrastructure is needed. After each batch of sandbox runs, the meta-agent reads the traces and results, decides what to keep or revert, and builds specialized tooling, verification, and orchestration that nobody coded manually.
Benchmark Gains and Harness Engineering Trade-offs
On SpreadsheetBench and TerminalBench, successive iterations show the harness improving: better prompts and tools yield higher scores that compound overnight. This enables cheaper, specialized agents per domain and workflow instead of one monolithic harness.
Harness optimization matters because each domain needs tailored prompts and tools (spreadsheets and terminals call for different ones, for example), which demands both domain and model expertise. Companies benefit from stack-specific harnesses that run smaller models. The future: domain experts write program.md and meta-agents auto-engineer the harness, much as AI now writes code. Define success, and get an optimized setup back in 24 hours.
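Under this division of labor, a domain expert's program.md might look roughly like the following. The headings and rules here are illustrative, not a format AutoAgent prescribes:

```markdown
# program.md (illustrative contents)

## Goal
Maximize the SpreadsheetBench pass rate of the task agent in agent.py.

## Allowed edits
- System prompt, tool definitions, and orchestration in agent.py.

## Avoid
- Hard-coding answers to individual benchmark tasks.
- Edits to the benchmark adapter or the eval harness itself.
```

The expert supplies the goal and guardrails in natural language; the meta-agent does the harness engineering.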