Build a Self-Learning Agent with Embeddings and NumPy
Create a domain-expert AI agent, built on OpenAI LLMs, that retrieves relevant insights via cosine similarity over embeddings, reasons over them, and stores new insights extracted from its own responses, building knowledge across interactions.
Core Agent Loop: Retrieval, Reasoning, and Self-Learning
The agent operates in a feedback loop: for a user query, embed it with OpenAI's text-embedding-3-small model and compute cosine similarity against stored insight embeddings using NumPy to retrieve the top k=3 most relevant texts from memory_store. Feed these as context to gpt-4o-mini (or any OpenAI model) alongside a system prompt defining the agent as a "Data & Analytics Architecture Expert" that explains architectures, recommends patterns, and references past insights in Markdown. Generate a structured response, then pass it to insight extraction: prompt the model to identify a single concise architectural insight or return "NONE". If found, embed and save to vector_store and memory_store. This in-memory setup (cleared on kernel restart) accumulates domain knowledge like "Key components of modern data platforms include ingestion, storage, processing, serving, and governance layers" from initial queries, enabling reuse in follow-ups.
Cosine similarity is calculated as np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)), prioritizing semantic angle over magnitude for robust retrieval even with varying text lengths.
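The formula above translates directly into a few lines of NumPy; a minimal sketch (the function name `cosine_similarity` is my own):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Angle-based similarity: ~1.0 for parallel vectors, 0.0 for orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Direction matters, magnitude does not:
a = np.array([1.0, 2.0, 3.0])
print(cosine_similarity(a, 10 * a))  # ~1.0 regardless of scale
```

Because the norms cancel out any difference in vector length, a short query and a long stored insight can still match closely if they point in the same semantic direction.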
Lightweight Implementation Without Vector DBs
Skip heavy frameworks or databases: use lists for memory_store (insight texts) and vector_store (embeddings as NumPy arrays). Key functions:
- save_insight(text): Embed the text, append to both stores.
- search_insights(query, k=3): If vector_store is empty, return an empty list; else embed the query, compute similarities, sort with np.argsort(-similarities), and return the top-k texts by index.
- extract_insight(response): Prompt: "Does this contain a useful data/analytics architecture insight? If yes, return as one concise sentence; else 'NONE'." Clean the output, save if not "NONE".
- run_agent(query): Retrieve insights (display them), build a prompt with system role + retrieved context + query, generate a response via chat completions, extract and store any new insight.
Requires the openai and numpy packages (os is standard library); set OPENAI_API_KEY. Run in Jupyter for Markdown display via IPython.display.display(Markdown(response)). Total: ~100 lines, no external dependencies beyond openai and numpy.
Observed Learning in Data Architecture Queries
Start with "What are the key components of a modern data platform?": No retrieval, generates response, saves insight on components (ingestion, storage, etc.).
Follow-up "What architecture patterns are common in modern data platforms?": Retrieves and injects prior insight, enriches response on ELT, streaming, data mesh.
Third: "What scalability practices are important?": Retrieves 2+ insights, builds on them for horizontal scaling, partitioning advice.
Retrieval grows: 0 → 1 → multiple, showing compounding context without manual KB curation. Responses stay focused because the system prompt instructs the agent to reference past insights explicitly.
Trade-offs and Extensions
Pros: minimal footprint (NumPy similarity beats a full vector DB for demos); demonstrates the agentic essentials of LLM reasoning plus embedding-based memory and learning; adapts to new domains via the prompt (e.g., finance through a new system role).
Limits: In-memory only (use Chroma/Pinecone/pgvector for persistence); no multi-turn chat history; insight extraction may miss nuances (tune prompt/model); embedding costs scale with interactions.
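The persistence limit is the easiest to lift without reaching for a vector DB: serialize the two parallel stores to disk between sessions. A hedged sketch (file naming is my own):

```python
import json
import numpy as np

def save_stores(memory_store, vector_store, path="insights"):
    """Write insight texts to JSON and embeddings to a stacked .npy file."""
    with open(f"{path}.json", "w") as f:
        json.dump(memory_store, f)
    stacked = np.vstack(vector_store) if vector_store else np.empty((0, 0))
    np.save(f"{path}.npy", stacked)

def load_stores(path="insights"):
    """Reload both stores; returns (texts, list of embedding arrays)."""
    with open(f"{path}.json") as f:
        memory_store = json.load(f)
    vectors = np.load(f"{path}.npy")
    return memory_store, [v for v in vectors]  # rows back to per-insight arrays
```

Calling save_stores at the end of a session and load_stores at startup keeps accumulated insights across kernel restarts; beyond a few thousand vectors, swapping in Chroma or pgvector becomes worthwhile.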
Scale by persisting stores, adding tools/function calling, or multi-agent swarms. Notebook: https://github.com/Krishsriniv/Domain-Expert-Advisor-AI-Agent.