Semantic Memory Storage for Contextual Recall

Transform raw task data (thoughts, actions, observations) into vectors using the all-MiniLM-L6-v2 embedding model. Structure as Trajectories to capture full causality chains, not just outcomes. Retrieve relevant memories via cosine similarity on embeddings, matching conceptually similar experiences despite wording differences. This replaces vague logging with precise, vector-based recall, allowing the agent to reference exact failure paths such as deprecated API_V1 errors.
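The storage-and-retrieval scheme can be sketched as follows. This is a minimal, self-contained illustration: a toy bag-of-words embedder stands in for all-MiniLM-L6-v2 so the example runs without external dependencies, and the `Trajectory`/`TrajectoryMemory` names are hypothetical, not taken from the actual implementation.

```python
# Sketch of a trajectory memory store with cosine-similarity retrieval.
# The real system embeds text with all-MiniLM-L6-v2; a toy bag-of-words
# embedding stands in here so the example is self-contained.
import math
from dataclasses import dataclass, field

def embed(text: str) -> dict:
    """Toy embedding: lowercase bag-of-words counts (MiniLM stand-in)."""
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

@dataclass
class Trajectory:
    """Full causality chain for one task attempt, not just the outcome."""
    task: str
    steps: list   # (thought, action, observation) tuples
    outcome: str  # "success" or "error"
    embedding: dict = field(default_factory=dict)

class TrajectoryMemory:
    def __init__(self):
        self.store = []

    def add(self, traj: Trajectory):
        # Embed the whole chain so retrieval matches on causality, not wording.
        text = traj.task + " " + " ".join(" ".join(s) for s in traj.steps)
        traj.embedding = embed(text)
        self.store.append(traj)

    def retrieve(self, query: str, k: int = 3):
        q = embed(query)
        ranked = sorted(self.store, key=lambda t: cosine(q, t.embedding),
                        reverse=True)
        return ranked[:k]
```

With a real embedding model, `retrieve` would surface the deprecated-API_V1 failure trajectory even when the new task is phrased differently; the toy embedder only matches on shared tokens.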

Decoupled Planner-Executor Loop for Adaptive Planning

Separate planning from execution: the Planner (powered by Grok-4.1-fast-reasoning or GPT-4.1) ingests the current task plus the top retrieved successful strategies and past mistakes. It synthesizes improved plans, e.g., switching to API_V2 after reviewing a prior failure trajectory. The Executor interfaces with the environment, returning deterministic feedback (success/error) used to log new trajectories. This decoupling lets planning evolve from execution outcomes without retraining or otherwise modifying the reasoning model itself.
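The control flow of the loop can be sketched as below. This is a hedged sketch, not the actual implementation: the Planner is stubbed with a rule (the real one is an LLM call), and the Executor assumes a toy environment where API_V1 is deprecated and API_V2 works.

```python
# Minimal sketch of the decoupled Planner-Executor loop. In the real
# system the Planner is an LLM (Grok-4.1-fast-reasoning or GPT-4.1);
# a rule-based stub stands in here so the control flow is runnable.

def planner(task: str, retrieved: list) -> str:
    """Synthesize a plan from the task plus retrieved trajectories.
    Stub logic: if any retrieved trajectory failed on API_V1, switch to API_V2."""
    for traj in retrieved:
        if traj["outcome"] == "error" and "API_V1" in traj["plan"]:
            return "call API_V2"
    return "call API_V1"  # default plan when memory holds no relevant failures

def executor(plan: str) -> str:
    """Interface with the environment; return deterministic feedback.
    Assumed toy environment: API_V1 is deprecated, API_V2 works."""
    return "success" if "API_V2" in plan else "error"

def run_task(task: str, memory: list) -> str:
    # Exact-match lookup stands in for the vector retrieval described above.
    retrieved = [t for t in memory if t["task"] == task]
    plan = planner(task, retrieved)
    outcome = executor(plan)
    # Log the new trajectory so the next planning pass can learn from it.
    memory.append({"task": task, "plan": plan, "outcome": outcome})
    return outcome
```

Running the same task twice shows the decoupling at work: the first call fails and logs a trajectory, and the second call's Planner reads that failure and switches plans.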

Failure-Driven Optimization Reduces Steps to Success

Force an initial failure (e.g., an API_V1 call) to populate memory, then observe how the retrieved failure data guides the Planner to correct actions on similar tasks. Without this, static LLMs repeat errors whenever their training data leaves the options ambiguous; with Memento, agents learn experientially, dropping the average number of steps to success from several attempts to one. In industrial settings, this enforces accountability by preventing the repetition of documented failures, enabling sovereign AI with private, persistent intelligence. The full Python implementation in MEMENTO_GROK.ipynb demonstrates the loop end-to-end.
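The steps-to-success effect can be made concrete with a small simulation. Everything here is an illustrative stub under assumed conditions (a toy deterministic environment where API_V1 is deprecated and API_V2 works, and a rule-based planner in place of the LLM); the `attempts_to_success` helper is hypothetical.

```python
# Sketch of failure-driven optimization: force an initial failure to seed
# memory, then count attempts-to-success on later, similar tasks.
# Toy assumption: API_V1 is deprecated (always errors), API_V2 works.

def attempts_to_success(task: str, memory: list, max_attempts: int = 5) -> int:
    """Retry a task until the stubbed environment reports success;
    return how many attempts it took."""
    for attempt in range(1, max_attempts + 1):
        # Planner stub: any logged API_V1 failure flips the plan to API_V2.
        failed_before = any(m["outcome"] == "error" and "API_V1" in m["plan"]
                            for m in memory)
        plan = "call API_V2" if failed_before else "call API_V1"
        outcome = "success" if "API_V2" in plan else "error"  # deterministic env
        # Log the trajectory either way, so failures inform future planning.
        memory.append({"task": task, "plan": plan, "outcome": outcome})
        if outcome == "success":
            return attempt
    return max_attempts

memory = []
first = attempts_to_success("fetch report A", memory)   # fails once, then corrects
second = attempts_to_success("fetch report B", memory)  # memory guides attempt 1
```

In this simulation `first` is 2 (one forced failure, one corrected attempt) while `second` is 1: the logged failure lets the planner succeed on the first attempt of a similar task, which is the steps-to-success reduction described above.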