Build a Local AI Knowledge Base with OpenKB & Llama

Use OpenKB to turn Markdown docs into a searchable wiki: install the tool, wire in a free Llama model via OpenRouter securely, ingest docs, auto-generate summaries and concept pages, query, lint, analyze links, and update incrementally, all in Python/Colab.

Secure LLM Integration Without Hardcoded Secrets

Start by installing OpenKB via pip install openkb --quiet in a Colab-like environment. Use getpass to input your free OpenRouter API key securely—never print or hardcode it. Set environment variables:

import os
from getpass import getpass

OPENROUTER_API_KEY = getpass("OpenRouter API key: ")  # prompted interactively, never printed
os.environ["OPENROUTER_API_KEY"] = OPENROUTER_API_KEY
os.environ["LLM_API_KEY"] = OPENROUTER_API_KEY
LLM_MODEL = "openrouter/meta-llama/llama-3.3-70b-instruct:free"

This configures Llama 3.3 70B (free tier, no credit card required) for all operations. Create a KB directory (/content/my_knowledge_base) with subfolders such as wiki/sources, wiki/summaries, and wiki/concepts. Write a config.yaml specifying the model and language, and a .env file for keys. Principle: Environment isolation prevents leaks; free models lower the barrier for prototyping.
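A minimal sketch of that setup, assuming the layout described above (the config.yaml keys are illustrative, not OpenKB's documented schema):

from pathlib import Path

kb_dir = Path("/content/my_knowledge_base")
wiki_dir = kb_dir / "wiki"
for sub in ["sources", "summaries", "concepts"]:
    (wiki_dir / sub).mkdir(parents=True, exist_ok=True)

# Illustrative config keys; check OpenKB's docs for the exact schema.
(kb_dir / "config.yaml").write_text(f"model: {LLM_MODEL}\nlanguage: en\n")
# The key comes from the getpass variable above; keep .env out of version control.
(kb_dir / ".env").write_text(f"OPENROUTER_API_KEY={OPENROUTER_API_KEY}\n")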

Common mistake: Hardcoding keys exposes them in git/logs. Avoid by using getpass and .env.

Ingesting Documents to Generate Linked Wiki Pages

Prepare raw Markdown docs in raw/ (e.g., on Transformers, RAG, and knowledge graphs). Run openkb add <doc_path> per file (see the loop sketch after the list below). OpenKB uses the LLM to:

  • Create summaries/<doc>.md: Concise overviews.
  • Extract concepts/*.md: Cross-doc syntheses with [[wikilinks]].
  • Update index.md (overview), log.md (timeline).
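In a notebook, the per-file ingestion can be scripted. A sketch, assuming the raw docs live in /content/raw and that the CLI is run from the KB directory:

import subprocess
from pathlib import Path

raw_dir = Path("/content/raw")  # assumed location of the source Markdown docs
for doc in sorted(raw_dir.glob("*.md")):
    # One `openkb add` per document, exactly as described in the text.
    subprocess.run(["openkb", "add", str(doc)], check=True, cwd="/content/my_knowledge_base")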

Example docs cover Transformer components (self-attention, positional encoding), the RAG pipeline (index/retrieve/generate), and KG integration (triples, GraphRAG). Output: an auto-linked Markdown wiki. Inspect it with a tree view:

def show_tree(root: Path, indent=0, max_depth=3): ...
show_tree(wiki_dir)
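Filled in, the helper might look like this minimal sketch (indentation and depth handling assumed):

from pathlib import Path

def show_tree(root: Path, indent: int = 0, max_depth: int = 3):
    # Stop recursing once the depth budget is exhausted.
    if indent >= max_depth:
        return
    for path in sorted(root.iterdir()):
        print("  " * indent + path.name)
        if path.is_dir():
            show_tree(path, indent + 1, max_depth)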

Quality criteria: pages follow a standard template (## Overview, ## Key Points, ## Related Concepts, ## Sources), and wikilinks enable navigation. Before: raw, isolated docs. After: an interconnected wiki with hubs like [[Transformer]].

"Each document is read by the LLM, which writes summaries + concept pages."

Querying for Synthesis and Saving Explorations

Use openkb query "<question>" for grounded answers drawn from the wiki. Examples (run as shown below):

  • "What is the Transformer architecture?" → Details self-attention, residuals.
  • "Connections between KGs, RAG, transformers?" → Structured reasoning over relations.

For deep queries, add --save to store in explorations/*.md:

openkb query "Synthesise key architectural themes..." --save

This creates persistent, linkable analyses. Run openkb list and openkb status for an inventory; openkb lint flags issues (orphans, contradictions, gaps) via reports written to reports/*.md.
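The housekeeping commands from the text, collected:

openkb list     # inventory of wiki pages
openkb status   # overview of the KB's current state
openkb lint     # flags orphans, contradictions, gaps; writes reports/*.md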

Principle: Queries aren't one-offs; save them for iterative refinement. Trade-off: the free model may hallucinate less with grounding, but it is slower than paid tiers.

"Synthesise the key architectural themes across transformers, RAG, and knowledge graphs into a unified mental model."

Programmatic Inspection of Wiki Graph Structure

Beyond the CLI, parse the wiki in Python: glob *.md files, extract wikilinks with re.findall(r'\[\[([^\]]+)\]\]', content), and count lines and links.

import re
from collections import Counter

wiki_pages = {}
for md_file in wiki_dir.rglob("*.md"):
    rel = str(md_file.relative_to(wiki_dir))
    content = md_file.read_text()
    # The character class [^\]]+ matches everything up to the closing ]].
    links = re.findall(r'\[\[([^\]]+)\]\]', content)
    wiki_pages[rel] = {"lines": len(content.splitlines()), "wikilinks": links}

# Count how often each target page is linked across the whole wiki.
link_targets = Counter(link for m in wiki_pages.values() for link in m["wikilinks"])

Visualize hubs (the most-linked pages) and cross-references. This reveals structure: e.g., [[Attention]] emerging as a hub. Criteria for a healthy wiki: balanced links, no isolated pages, a growing set of concepts.
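Continuing the snippet above, a minimal hub report; the heading matches the quoted output below, while the per-line format is assumed:

print("🏆 Most-referenced wiki pages (hub concepts):")
for target, count in link_targets.most_common(10):
    print(f"  [[{target}]]: {count} inbound links")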

"🏆 Most-referenced wiki pages (hub concepts):"

Incremental Updates Without Full Rebuilds

Add new docs anytime: openkb add sparse_attention.md (covering Longformer and FlashAttention). This triggers regeneration of the affected summaries and concepts. Before: 3 concept pages; after: new ones linking back to RAG and Transformers. The log tracks the changes.
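One way to observe the effect programmatically, reusing wiki_dir from the earlier snippets (file path assumed):

import subprocess

before = len(list((wiki_dir / "concepts").glob("*.md")))
subprocess.run(["openkb", "add", "sparse_attention.md"], check=True, cwd="/content/my_knowledge_base")
after = len(list((wiki_dir / "concepts").glob("*.md")))
print(f"💡 Concept pages: {before} -> {after}")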

Principle: Supports evolving corpora. Trade-off: Frequent adds increase compute; batch for efficiency.

Exercise: Add your own docs (e.g., custom research notes), run multi-hop queries, lint, and graph-analyze the result.

Assumes: Python basics, Markdown familiarity, and an API key from openrouter.ai. The KB fits into RAG/agent pipelines as a local grounding store.
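To use the KB as a grounding store inside a larger pipeline, one option is to shell out to the CLI and capture the answer. A sketch, with the working directory and output handling assumed:

import subprocess

def kb_answer(question: str) -> str:
    # Ask the wiki-grounded KB and return the synthesized answer as text.
    result = subprocess.run(
        ["openkb", "query", question],
        capture_output=True, text=True, check=True,
        cwd="/content/my_knowledge_base",
    )
    return result.stdout

print(kb_answer("How do knowledge graphs complement RAG?"))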

"Adding: sparse_attention.md" → "💡 Concept pages: 3 -> 5"

Key Takeaways

  • Install OpenKB and use getpass for a secure OpenRouter free-Llama setup; this keeps secrets out of code.
  • Initialize the KB with config.yaml and .env; create wiki subdirectories for structured output.
  • Ingest Markdown via openkb add: Auto-creates summaries, concepts with [[wikilinks]].
  • Query with openkb query; save deep ones via --save for explorations.
  • Lint (openkb lint) catches gaps/orphans; parse wikilinks in Python for graph insights.
  • Update incrementally: openkb add new_doc evolves wiki live.
  • Inspect: list/status for overview, tree/md viewers for details.
  • Free models like mistral-7b-instruct:free swap in via LLM_MODEL.
  • Builds grounded querying beyond vanilla RAG: Wiki + links + synthesis.
  • Prototype in Colab; scale to prod with paid models/local LLMs.
