Deterministic Policy vs. Agentic Judgment
To build production-ready AI systems, the author advocates a strict architectural split: deterministic rules handle the "boring" majority of operations, while LLMs are reserved for cases that require judgment. In an inventory pipeline, standard tasks such as VIP preemption (reclaiming reserved stock from standard orders) are defined in YAML. These rules execute with zero LLM calls, which keeps their behavior consistent and reliable. The LLM is invoked only when the rules cannot resolve a case, specifically when demand cannot be met and the system must decide how much stock to reorder or how to brief a manager.
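The split described above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's implementation: the `Order` and `Stock` types, the `apply_rules` and `handle` functions, and the action names are all hypothetical, and the rules are written inline rather than loaded from YAML to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Order:
    sku: str
    quantity: int
    tier: str  # "vip" or "standard" (hypothetical tiers)


@dataclass
class Stock:
    available: int
    reserved_for_standard: int


def apply_rules(order: Order, stock: Stock) -> Optional[str]:
    """Return an action name if a deterministic rule resolves the order, else None."""
    # Rule 1 (assumed): fulfil directly when free stock covers the order.
    if order.quantity <= stock.available:
        stock.available -= order.quantity
        return "fulfil"
    # Rule 2: VIP preemption, reclaiming stock reserved for standard orders.
    if order.tier == "vip" and order.quantity <= stock.available + stock.reserved_for_standard:
        reclaimed = order.quantity - stock.available
        stock.reserved_for_standard -= reclaimed
        stock.available = 0
        return "fulfil_with_preemption"
    return None  # no rule applies: demand cannot be met


def handle(order: Order, stock: Stock, ask_llm: Callable[[Order, Stock], str]) -> str:
    """Try the rules first; call the model only when they leave the case open."""
    action = apply_rules(order, stock)
    if action is not None:
        return action  # zero LLM calls on this path
    return ask_llm(order, stock)
```

In this sketch `ask_llm` is injected as a plain callable, so the rule path can be tested without any model, and the number of model calls can be counted directly.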
The Role of Ontology in Routing
Instead of coupling inventory logic directly to the MCP trigger, the system uses an ontology layer to define routing strategy. The shortage calculation is treated as a data-access helper, while the decision to act on that shortage belongs to an Inventory Agent. This keeps the YAML rules clean and focused on strategy rather than complex mathematical aggregation. The agentic workflow follows a clear pattern: detect the gap, judge the replenishment quantity, and execute the write-back to the ERP (Odoo).
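The detect, judge, execute pattern might look roughly like the following. The names `shortage`, `InventoryAgent`, `judge`, and `write_back` are assumptions made for illustration; in particular, `write_back` stands in for whatever call writes to Odoo, which the source does not specify.

```python
from typing import Callable, Dict


def shortage(demand: Dict[str, int], on_hand: Dict[str, int]) -> Dict[str, int]:
    """Data-access helper: per-SKU unmet demand, omitting SKUs with no gap."""
    gaps = {}
    for sku, wanted in demand.items():
        gap = wanted - on_hand.get(sku, 0)
        if gap > 0:
            gaps[sku] = gap
    return gaps


class InventoryAgent:
    """Owns the decision to act on a shortage; the helper above only reports it."""

    def __init__(
        self,
        judge: Callable[[str, int], int],        # hypothetical LLM-backed judgment
        write_back: Callable[[str, int], None],  # hypothetical ERP write
    ) -> None:
        self.judge = judge
        self.write_back = write_back

    def run(self, demand: Dict[str, int], on_hand: Dict[str, int]) -> Dict[str, int]:
        orders = {}
        for sku, gap in shortage(demand, on_hand).items():  # 1. detect the gap
            qty = self.judge(sku, gap)                      # 2. judge the quantity
            self.write_back(sku, qty)                       # 3. execute the write-back
            orders[sku] = qty
        return orders
```

Keeping the aggregation in `shortage` means a routing rule only needs to name the agent, not restate the arithmetic.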
Managing Token Costs and Tool Selection
Large tool catalogs often degrade LLM performance and increase token costs because of the "schema tax" of loading every tool definition into the context window. With policy-driven dispatch, the model is never presented with a sprawling menu of tools. Instead, the deterministic engine routes the request to a specific agent, which exposes only the tools necessary for its narrow domain. As a result, token spend scales with the number of actual decisions made, rather than with the number of operations performed or tools connected.
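A minimal sketch of this kind of dispatch, assuming a static routing table, is shown below. The event types, agent names, and tool names are invented for illustration and do not come from the source.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical routing policy: event type -> agent responsible for it.
ROUTES: Dict[str, str] = {
    "shortage_detected": "inventory_agent",
    "briefing_requested": "briefing_agent",
}

# Hypothetical per-agent tool lists; each agent sees only its own domain.
AGENT_TOOLS: Dict[str, List[str]] = {
    "inventory_agent": ["get_shortage", "create_purchase_order"],
    "briefing_agent": ["summarize_shortages"],
}


def tools_for_event(event_type: str) -> Tuple[Optional[str], List[str]]:
    """Pick the agent deterministically and return only that agent's tools.

    Events with no route are handled without a model, so they contribute
    no tool schemas (and no tokens) at all.
    """
    agent = ROUTES.get(event_type)
    if agent is None:
        return None, []
    return agent, AGENT_TOOLS[agent]
```

The model's context for a shortage then contains two tool definitions in this sketch, however many tools the other agents hold.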
Fail-Safe Design
Every agentic judgment is bounded by deterministic constraints. For replenishment, the system calculates a "fallback floor" (a safety buffer), so that if the LLM fails or produces an outlier, the system defaults to a safe, deterministic reorder quantity. Similarly, briefings are generated from templates if the model is unavailable, so human stakeholders always receive actionable information regardless of model performance.
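One way to express these guards in code is sketched here. The source does not say how the floor or the outlier test is computed, so the formula (`gap + safety_buffer`), the upper bound (`max_multiple`), and all function names below are assumptions chosen only to make the idea concrete.

```python
from typing import Callable, Optional


def safe_reorder_qty(
    gap: int,
    propose: Callable[[int], int],  # hypothetical LLM-backed proposal
    safety_buffer: int = 5,         # assumed buffer size
    max_multiple: int = 3,          # assumed outlier bound
) -> int:
    """Return the model's quantity if it is plausible, else the fallback floor."""
    floor = gap + safety_buffer     # deterministic fallback quantity (assumed formula)
    ceiling = floor * max_multiple
    try:
        proposed = propose(gap)
    except Exception:
        return floor                # model failed: use the floor
    if not isinstance(proposed, int) or not (floor <= proposed <= ceiling):
        return floor                # outlier or malformed answer: use the floor
    return proposed


def briefing(
    sku: str,
    gap: int,
    qty: int,
    generate: Optional[Callable[[str, int, int], str]] = None,
) -> str:
    """Return a model-written briefing, or a templated one if the model is unavailable."""
    template = f"Shortage on {sku}: {gap} units short; reorder of {qty} units placed."
    if generate is None:
        return template
    try:
        text = generate(sku, gap, qty)
    except Exception:
        return template
    return text if text and text.strip() else template
```

In both functions the deterministic result is computed before the model is consulted, so a usable answer exists on every path.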