Karpathy's LLM Wiki: Self-Healing Knowledge Base
Compile raw sources into a markdown wiki with an LLM as the compiler: ingest touches 10-15 pages per article, query files answers back as new pages, and lint flags contradictions. The approach scales 100 articles into 400k cross-linked words with no vector database.
Replace Note Graveyards with LLM-Compiled Wikis
Traditional note-taking fails because humans forget roughly 70% of new information within 24 hours (the Ebbinghaus forgetting curve), and knowledge workers spend about 1.8 hours daily searching their existing notes, roughly $20,000 per person per year in lost productivity, while some 80% report information overload. Tools like Notion or Obsidian become graveyards: clips and highlights pile up but are unfindable six months later. RAG (Retrieval-Augmented Generation) makes this worse by speeding retrieval from messy dumps without building understanding; most implementations never reach production, conversations reset, and knowledge stays siloed.
Karpathy's fix: treat raw sources (articles, PDFs, datasets) as source code, the LLM as the compiler, and the output wiki as the executable. Drop in a source; the LLM extracts insights, writes summaries, and updates interconnected markdown pages with entities, cross-references, and backlinks. The result is structured knowledge that compounds: one article touches 10-15 wiki pages, so 100 inputs become 400,000+ words of interlinked, queryable content. No vector DB is needed, just plain markdown, Git for versioning, and a config schema defining entities and rules. This beats RAG by producing persistent, evolvable documents you read directly.
Three-Layer Stack Enables Human-LLM Division
Layer 1 (Sources): immutable raw files; drop them in and forget. Layer 2 (Wiki): LLM-generated markdown with entity pages (e.g., concepts, people), summaries, an index mapping everything, and a log tracking changes. Layer 3 (Schema): a config file specifying tracked entities, page structures, and maintenance rules; it tells the LLM what to enforce.
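A sketch of what a Layer-3 schema might contain; the entity types, paths, and rules here are invented for illustration, not the actual config from Karpathy's gist:

```python
# Hypothetical schema: tells the LLM what to track and which invariants to enforce.
SCHEMA = {
    "entities": {
        "concept": {"page": "concepts/{name}.md",
                    "sections": ["Summary", "Sources", "Related"]},
        "person":  {"page": "people/{name}.md",
                    "sections": ["Summary", "Claims", "Related"]},
    },
    "index": "index.md",   # master map of every page
    "log": "log.md",       # append-only change journal
    "lint_rules": [
        "every page appears in index.md",
        "no page without inbound links",
        "flag claims that contradict other pages",
    ],
}
```

Keeping the schema as data rather than prose means the same rules can be pasted into prompts for ingest, query, and lint alike.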
You edit and curate inputs; the LLM handles synthesis and upkeep as librarian. Karpathy's gist (5k+ stars) runs this workflow for him personally with minimal intervention. Daniel Miessler's Fabric (40k+ stars) arrived at the same pattern independently: it captures signals into persistent layers and learns from failures. The pattern: second brains are evolving from passive collections into LLM-maintained systems that keep themselves current.
Ingest, Query, Lint: Operations That Snowball Value
Ingest: add a source → the LLM summarizes it and creates or updates 10-15 related pages (e.g., entity pages gain new cross-references and backlinks). This snowballs: 10 sources yield roughly 40 pages, 50 yield 200, and at 100 the wiki holds domain-expert knowledge exceeding what human memory can retain.
Query: ask a question → the LLM scans the index, reads the relevant pages, synthesizes an answer, then files valuable responses as new wiki pages. Every interaction enriches the base.
Lint: automated health checks flag contradictions, stale claims, orphan pages (no inbound links), and missing cross-references; the wiki self-heals. Run it periodically; contradictions surface as the LLM spots inconsistencies across pages.
Compounding impact: the wiki becomes a research partner that knows your domain deeply. Start small: pick a topic, ingest one source via Karpathy's prompt, add more, and watch the interconnections emerge.
Applications Across Domains
Research: synthesize 100 papers into one cross-referenced wiki. Health: track goals, diet, supplements, and exercise with automatic links. Business: compile Slack, Confluence, and competitor intel into a reasoning-ready base; every firm has directories of raw material, and compiling them is the product. Curate inputs strictly; let the LLM maintain the rest. The open gist lets anyone replicate it: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f.