DAG Structure Enables Typed, Compositional Tools
Tool-augmented agents face two scaling problems: libraries grow while fixed context budgets cap how many tools can be retrieved, and flat text indexing ignores code's typed, hierarchical nature. CoCoDA addresses both with a single code-native directed acyclic graph (DAG). Nodes represent primitive (base) or composite (higher-level) tools and store typed signatures, descriptions, pre/post-conditions, and worked examples. Edges encode invocation dependencies, capturing reusable subroutines as composable units. Treating the tool library as a graph rather than a flat list avoids prompt bloat.
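The node and edge structure described above can be sketched in code. This is a minimal, hypothetical illustration: the field names (`input_types`, `preconditions`, `depends_on`, etc.) and the DAG check are assumptions for exposition, not the paper's actual data model.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a CoCoDA-style tool node; all field names are
# hypothetical stand-ins for the typed signatures, specs, and examples
# the text describes.
@dataclass
class ToolNode:
    name: str
    input_types: tuple                 # typed signature: argument types
    output_type: str                   # typed signature: return type
    description: str                   # natural-language summary
    preconditions: list = field(default_factory=list)
    postconditions: list = field(default_factory=list)
    examples: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)  # edges: tools this one invokes


def is_acyclic(nodes: dict) -> bool:
    """Check that invocation dependencies form a DAG (DFS cycle detection)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {n: WHITE for n in nodes}

    def dfs(n):
        color[n] = GRAY               # on the current DFS path
        for m in nodes[n].depends_on:
            if color[m] == GRAY:      # back edge: cycle found
                return False
            if color[m] == WHITE and not dfs(m):
                return False
        color[n] = BLACK              # fully explored
        return True

    return all(dfs(n) for n in nodes if color[n] == WHITE)
```

A composite tool is then just a node whose `depends_on` list points at primitives or other composites, and well-formedness of the library reduces to `is_acyclic` holding after each addition.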
Efficient Retrieval Prunes Context via Typed Unification
At inference time, Typed DAG Retrieval proceeds in stages: first prune candidates by symbolic signature unification (matching input/output types); then rank survivors by semantic similarity of descriptions; then filter by pre/post-condition behavioral specifications; finally disambiguate among the smallest remaining set using worked examples. Only viable subgraphs are materialized in context, keeping retrieval cost sublinear in library size. Theoretical results establish the retrieval-cost reduction, sublinear time complexity, and preservation of DAG well-formedness.
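The staged cascade can be sketched as a filter-then-rank pipeline. This is a toy approximation under stated simplifications: exact type matching stands in for unification, token overlap stands in for semantic embedding similarity, and set containment stands in for behavioral-spec checking; none of these are the paper's actual implementations.

```python
# Toy sketch of the staged retrieval cascade. Each stage shrinks the
# candidate set before the more expensive next stage runs.

def unifies(query_in, query_out, tool):
    """Stage 1: symbolic signature check (exact match stands in for unification)."""
    return tool["inputs"] == query_in and tool["output"] == query_out

def similarity(query_desc, tool_desc):
    """Stage 2: token-overlap (Jaccard) proxy for semantic description similarity."""
    q, t = set(query_desc.lower().split()), set(tool_desc.lower().split())
    return len(q & t) / max(len(q | t), 1)

def retrieve(query, tools, k=3):
    # Stage 1: prune by typed signature unification
    cands = [t for t in tools if unifies(query["inputs"], query["output"], t)]
    # Stage 2: rank survivors by description similarity
    cands.sort(key=lambda t: similarity(query["desc"], t["desc"]), reverse=True)
    # Stage 3: filter by pre/post-condition specs (set containment stand-in)
    cands = [t for t in cands if set(query.get("conds", [])) <= set(t.get("conds", []))]
    # Stage 4: only the small surviving set is materialized in context
    return cands[:k]
```

Because the cheap symbolic stage runs first, most of the library is discarded before any similarity scoring happens, which is what keeps the per-query cost sublinear as the library grows.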
Training Folds Trajectories into Evolving Composites
Successful agent trajectories are distilled into new, validated composite tools, folding primitives into higher-level nodes. The planner is fine-tuned under a DAG-induced reward that credits composites in proportion to the number of primitives they expand into, incentivizing decomposition. Conservative updates ensure monotone co-evolution: performance never regresses as the library expands. This shaped reward yields compositional gains, with complex solutions assembled from simpler building blocks.
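The expansion-proportional credit can be sketched as follows. The recursion, the library encoding (tool name to list of sub-tools), and the `base`/`alpha` shaping parameters are all illustrative assumptions, not the paper's actual reward function.

```python
# Hypothetical sketch of a DAG-induced reward: a composite earns credit
# in proportion to how many primitive calls it ultimately expands into.

def expansion_size(tool, library):
    """Count the primitive invocations a tool expands into (primitives count as 1)."""
    deps = library[tool]
    if not deps:                      # primitive: no sub-tools
        return 1
    return sum(expansion_size(d, library) for d in deps)

def trajectory_reward(calls, library, base=1.0, alpha=0.1):
    """Shaped reward for a successful trajectory: base task reward plus
    a bonus scaled by the total primitive expansion of the tools used."""
    return base + alpha * sum(expansion_size(c, library) for c in calls)
```

Under this shaping, a plan that invokes one large composite earns the same expansion credit as a plan spelling out all of its primitives, so the planner is not penalized for abstracting, while newly folded composites immediately inherit the credit of the subroutines they replace.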
CoCoDA outperforms tool-use and library-learning baselines on mathematical reasoning (GSM8K, MATH), tabular analysis, and code tasks, with an 8B model matching or exceeding a 32B teacher, showing that small models can scale through structured tool evolution.