The Problem with Stateless Multi-Agent Systems

Most current multi-agent frameworks for code generation are fundamentally stateless. They treat every problem as a fresh start, discarding the valuable experience gained from previous attempts, debugging sessions, and test failures. This lack of memory limits their ability to handle the rigorous, multi-step reasoning required for competitive programming, where edge cases and complex logic often cause standard LLMs to fail.

The Solvita Framework: A Closed-Loop Evolution

Solvita introduces a stateful, agentic evolution framework that enables continuous learning without modifying the underlying LLM weights. The system operates through four specialized agents that function as a closed-loop pipeline:

  • Planner: Determines the high-level strategy for solving a problem.
  • Solver: Executes the actual program synthesis based on the plan.
  • Oracle: Provides certified supervision to validate the logic and code quality.
  • Hacker: Actively searches for adversarial vulnerabilities or edge cases that might break the generated code.
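
To make the closed loop concrete, here is a minimal sketch of how the four roles might hand work to one another. It is an illustration under assumptions, not Solvita's actual implementation: the `Attempt` record, the `run_closed_loop` function, the `max_rounds` cap, and the toy agents are all hypothetical names, and each agent is modeled as a plain callable rather than an LLM call.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attempt:
    """One pass through the Planner -> Solver -> Oracle -> Hacker loop (hypothetical record)."""
    plan: str
    code: str
    certified: bool
    counterexamples: List[str] = field(default_factory=list)


def run_closed_loop(problem, planner, solver, oracle, hacker, max_rounds=3):
    """Repeat the four-agent loop until the code is certified and unbroken."""
    feedback: List[str] = []   # failures carried into the next round
    history: List[Attempt] = []
    for _ in range(max_rounds):
        plan = planner(problem, feedback)        # high-level strategy
        code = solver(problem, plan)             # program synthesis
        certified = oracle(problem, code)        # supervision verdict
        counterexamples = hacker(problem, code)  # adversarial inputs that break the code
        history.append(Attempt(plan, code, certified, counterexamples))
        if certified and not counterexamples:
            break
        if not certified:
            feedback = feedback + ["oracle rejected the solution"]
        feedback = feedback + counterexamples
    return history


# Toy stand-ins for the agents (assumptions for illustration only).
def toy_planner(problem, feedback):
    return "handle edge cases" if feedback else "naive approach"


def toy_solver(problem, plan):
    return "fixed" if plan == "handle edge cases" else "buggy"


def toy_oracle(problem, code):
    return code == "fixed"


def toy_hacker(problem, code):
    return ["n = 0"] if code == "buggy" else []
```

Calling `run_closed_loop("sum of array", toy_planner, toy_solver, toy_oracle, toy_hacker)` runs two rounds with these toy agents: the first attempt is rejected and broken by the input `n = 0`, and that feedback steers the second round to a certified solution.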

Accumulating Experience via Knowledge Networks

Each agent in the Solvita framework is paired with a trainable, graph-structured knowledge network. As the agents interact, the system captures outcome signals such as pass/fail verdicts, test certification quality, and vulnerabilities identified by the Hacker. These signals are converted into reinforcement learning updates to the weights of the knowledge network.
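
One way to picture this step is a weighted graph whose edges are nudged toward a scalar reward after each episode. The sketch below is a simplified stand-in and not the update rule Solvita uses: the reward coefficients, the `KnowledgeNetwork` class, the `learning_rate` parameter, and the node names are all assumptions chosen for illustration, and the update is a plain running-average (bandit-style) rule.

```python
from collections import defaultdict


def outcome_to_reward(passed: bool, certification_quality: float, hacks_found: int) -> float:
    """Collapse outcome signals into one scalar (coefficients are illustrative)."""
    reward = 1.0 if passed else -1.0
    reward += 0.5 * certification_quality  # assumed to lie in [0, 1]
    reward -= 0.25 * hacks_found           # each successful hack is penalized
    return reward


class KnowledgeNetwork:
    """Hypothetical graph memory: edge weights track how well a path has worked."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate
        self.edge_weights = defaultdict(float)  # (src, dst) -> weight, default 0.0

    def update(self, path, reward: float) -> None:
        """Move every edge along the visited path a step toward the reward."""
        for edge in zip(path, path[1:]):
            old = self.edge_weights[edge]
            self.edge_weights[edge] = old + self.learning_rate * (reward - old)


network = KnowledgeNetwork(learning_rate=0.5)
reward = outcome_to_reward(passed=True, certification_quality=1.0, hacks_found=0)
network.update(["graph", "shortest-path", "dijkstra"], reward)
```

Because no model weights are touched, all learning in this sketch lives in `edge_weights`, which mirrors the idea that experience accumulates outside the LLM.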

This architecture lets the agents route future queries dynamically based on historical performance. By retaining what worked and what failed in previous tasks, the system accumulates transferable reasoning experience over time. In the reported results, Solvita outperforms existing multi-agent pipelines and nearly doubles the accuracy of single-pass baselines on benchmarks including CodeContests, APPS, AetherCode, and live Codeforces rounds.
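
The routing idea mentioned above can be sketched as a softmax choice over learned weights, so that options with a better track record are picked more often while weaker ones are still explored occasionally. This is an assumed mechanism for illustration only; the source does not specify how Solvita routes queries, and `routing_probabilities`, `route`, `temperature`, and the strategy names are hypothetical.

```python
import math
import random


def routing_probabilities(candidates, weights, temperature=1.0):
    """Softmax over historical weights; unseen candidates default to weight 0."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores = [weights.get(c, 0.0) / temperature for c in candidates]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(candidates, weights, temperature=1.0, rng=None):
    """Sample one candidate in proportion to its routing probability."""
    rng = rng or random.Random()
    probs = routing_probabilities(candidates, weights, temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Weights as they might look after several episodes (made-up values).
learned = {"dynamic programming": 2.0, "greedy": 0.0}
strategies = ["dynamic programming", "greedy", "brute force"]
choice = route(strategies, learned, temperature=0.5, rng=random.Random(0))
```

A lower `temperature` makes the choice nearly greedy, while a higher one spreads probability across untested strategies.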