Modular Agent Architecture
To move beyond simple chat interactions, structure your agent as a pipeline of specialized roles. This separation of concerns improves reliability and debuggability:
- Planner: Responsible for high-level strategy. It takes the user goal and outputs a structured JSON object containing the objective, a list of sequential steps, and potential tool checkpoints.
- Executor: The engine that performs the work. It operates in a loop, calling tools as needed and maintaining a trace of all actions. It keeps intermediate notes to ensure the model stays grounded in the current task.
- Critic: The quality control layer. It reviews the executor's draft against the original goal and the execution trace, identifying issues and generating a polished final output.
Tooling and State Management
For an agent to be useful, it must interact with the environment reliably. Use structured tool definitions and machine-readable outputs to minimize errors:
- Tool Schema: Define clear function schemas (e.g.,
calc,kb_search,write_file) so the model can reliably invoke Python functions. - State Tracking: Use a
dataclassto maintain theAgentState, which stores the goal, memory, and a fulltraceof tool calls. This trace is critical for debugging and allows the critic to understand exactly how the agent arrived at its draft. - Structured Outputs: Ensure tools return dictionaries with an
okstatus and relevant data. This prevents the agent from hallucinating tool results and makes it easier to handle errors gracefully.
Implementation Workflow
- Initialization: Set up the OpenAI client and define a persistent
AgentState. - Planning: Prompt the model to generate a structured plan in JSON format. If parsing fails, provide a fallback mechanism to proceed directly.
- Execution Loop: Run the executor for a fixed number of iterations (e.g., 12). In each step, check for tool calls, execute them in Python, and append the results back to the message history so the model can adjust its next move.
- Critique & Finalization: Pass the draft and the execution trace to the critic to generate the final deliverable. This ensures that even if the executor makes minor errors, the critic can catch and fix them before the user sees the result.