LLM-Maintained Wikis Beat RAG for Knowledge
Have LLMs build and update a persistent, interlinked markdown wiki from your sources—instead of rediscovering facts via RAG every query. Knowledge compounds over time.
Persistent Wiki Replaces Rediscovery in RAG
Standard RAG setups (uploading docs to NotebookLM or ChatGPT) force the LLM to hunt for chunks and synthesize from scratch on every query. A subtle question spanning five docs means re-assembling the same fragments each time. Nothing accumulates; knowledge evaporates after each chat.
This pattern flips it: LLMs incrementally build a persistent wiki of markdown files between raw sources and queries. New sources trigger extraction, integration, and updates—flagging contradictions, strengthening syntheses, adding cross-links. The wiki compounds: entity pages evolve, topic summaries deepen, overviews reflect all ingested data.
"The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you've read."
You curate sources and steer; LLM handles grunt work—summarizing, filing, bookkeeping. Pair with Obsidian: LLM edits in one pane, you browse graph/links in the other. Works for personal tracking (health, goals), research deep-dives, book companions (like Tolkien fan wikis), team intranets (Slack, transcripts), competitive intel.
Trade-off: Initial schema setup and supervision pay off as the wiki scales. At ~100 sources and hundreds of pages, a simple index suffices; no vector DB needed yet.
Three-Layer Stack Ensures Discipline
Raw sources: Immutable docs (articles, papers, images). LLM reads, never writes.
Wiki: LLM-owned markdown directory. Entity pages (people/events), concept pages, summaries, comparisons, syntheses. Updates touch 10-15 files per ingest.
Schema: Single MD file (e.g., CLAUDE.md) dictating structure, conventions, workflows. Co-evolve it with LLM. Defines page formats, ingest steps, query outputs. Without this, LLM chatters generically; with it, it's a disciplined maintainer.
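A minimal schema sketch (the post leaves structure up to you; these filenames, sections, and conventions are illustrative, not prescribed):

```markdown
# Wiki Schema (CLAUDE.md)

## Structure
- raw/      immutable sources; read-only for the LLM
- wiki/     entity, concept, and synthesis pages
- index.md  one line per page: title, category, summary
- log.md    append-only ingest/query timeline

## Page conventions
- Entity pages: ## Summary, ## Key facts, ## Sources, ## Open questions
- Link entities with [[wikilinks]]; flag conflicts with "> CONTRADICTION:"

## Workflows
- Ingest: discuss takeaways, write summary page, update index.md and
  log.md, revise affected entity/concept pages
- Query: read index.md first; answer with citations; file good answers
  back as pages
- Lint: list orphans, stale claims, contradictions; suggest new sources
```

Co-evolve this file with the LLM rather than fixing it upfront; the schema is the difference between generic chatter and disciplined maintenance.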
"You're in charge of sourcing, exploration, and asking the right questions. The LLM does all the grunt work—the summarizing, cross-referencing, filing, and bookkeeping."
From comments: Refine schema with type-specific templates (person vs. event pages; 7 types max). Every task yields two outputs: direct answer + wiki updates. Classify sources first (report vs. transcript) for targeted extraction—saves tokens, boosts depth.
Ingest, Query, Lint: Core Workflows
Ingest: Drop in a source, prompt the LLM per the schema. Flow: discuss takeaways, write a summary page, update index/log, revise 10+ wiki pages (entities, concepts). Stay involved to steer emphasis, or batch-process. One source ripples across the wiki.
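The mechanical part of ingest (filing the source, stamping the log) can be sketched in a few lines; the post has the LLM do this via prompts, so the helper below is a hypothetical illustration, with `raw/` and `log.md` taken from the pattern:

```python
"""Hypothetical ingest helper: file a source under raw/ (immutable),
append a dated entry to log.md. The LLM then does the wiki updates."""
from datetime import date
from pathlib import Path
import shutil

def ingest(source: str, title: str, root: str = ".") -> Path:
    root_path = Path(root)
    raw = root_path / "raw"
    raw.mkdir(parents=True, exist_ok=True)
    dest = raw / Path(source).name
    shutil.copy(source, dest)  # raw sources stay read-only for the LLM
    entry = (f"## {date.today().isoformat()} ingest | {title}\n"
             f"- filed {dest.name}; wiki pages pending LLM update\n")
    with (root_path / "log.md").open("a", encoding="utf-8") as f:
        f.write(entry)  # log.md is append-only by convention
    return dest
```

The timestamped `## <date> ingest | <title>` header matches the log format described below, so recency greps keep working.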
Query: LLM scans index, reads pages, answers with citations (table, Marp slides, charts). File answers back as new pages—e.g., your analysis becomes permanent asset.
Lint: Periodic health-check. LLM flags contradictions, stale claims, orphans, gaps. Suggests new sources/questions. Keeps wiki coherent as it grows.
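One lint check is mechanical enough to show concretely: orphan detection, i.e. pages no other page links to. A sketch, assuming `[[wikilink]]` syntax and a flat wiki directory (the post has the LLM do this; a script is just a cross-check):

```python
"""Hypothetical lint helper: list wiki pages with no inbound [[wikilinks]]."""
import re
from pathlib import Path

# capture the target of [[Page]], [[Page|alias]], or [[Page#section]]
LINK = re.compile(r"\[\[([^\]|#]+)")

def orphans(wiki_dir: str) -> list[str]:
    pages = {p.stem: p for p in Path(wiki_dir).glob("*.md")}
    linked = set()
    for page in pages.values():
        for target in LINK.findall(page.read_text(encoding="utf-8")):
            linked.add(target.strip())
    return sorted(name for name in pages if name not in linked)
```

Contradiction and staleness checks genuinely need the LLM; orphans and broken links are where a dumb script earns its keep.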
"Good answers can be filed back into the wiki as new pages. ... This way your explorations compound in the knowledge base just like ingested sources do."
Comment extensions: Token budgets for progressive disclosure (L0: ~200-token project context; L1: 1-2K-token index; up to L3: 20K-token full docs). A human should verify high-stakes claims; left unchecked, the LLM will synthesize without citations.
Index, Log, and Scaling Tools
index.md: Content map—pages listed with summaries, categories, metadata. LLM reads it first for queries. Scales to hundreds of pages sans embeddings.
log.md: Append-only timeline ("## 2026-04-02 ingest | Article"). Grep for recency (e.g., last 5 entries).
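The "grep for recency" step is just tailing the entry headers. A minimal sketch assuming the `## <date> <action> | <title>` header format shown above (plain `grep '^## ' log.md | tail -5` works equally well):

```python
"""Return the most recent n entry headers from an append-only log.md."""
from pathlib import Path

def recent_headers(log_path: str, n: int = 5) -> list[str]:
    lines = Path(log_path).read_text(encoding="utf-8").splitlines()
    headers = [ln for ln in lines if ln.startswith("## ")]
    return headers[-n:]  # append-only, so file order is chronological
```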
At scale: Add qmd (local hybrid search: BM25/vector + LLM rerank; CLI for agents). Or vibe-code simple scripts. Git repo for versioning/branching.
Obsidian tips: Web Clipper for MD sources; hotkey-download images to raw/assets (LLM views separately); graph view for structure; Dataview for frontmatter queries (tags/dates); Marp plugin for slides.
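The kind of frontmatter query Dataview runs inside Obsidian can also be done outside it, which matters if agents query the wiki from the CLI. A sketch assuming minimal YAML frontmatter with an inline tag list (`tags: [health, goals]`); field names are illustrative:

```python
"""Dataview-style frontmatter query in plain Python: parse a minimal
'---'-delimited frontmatter block and filter pages by tag."""
from pathlib import Path

def frontmatter(path: Path) -> dict:
    lines = path.read_text(encoding="utf-8").splitlines()
    meta: dict = {}
    if lines and lines[0].strip() == "---":
        for ln in lines[1:]:
            if ln.strip() == "---":
                break
            key, _, value = ln.partition(":")
            value = value.strip()
            if value.startswith("[") and value.endswith("]"):
                # inline YAML list, e.g. tags: [health, goals]
                meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
            else:
                meta[key.strip()] = value
    return meta

def pages_with_tag(wiki_dir: str, tag: str) -> list[str]:
    return sorted(p.name for p in Path(wiki_dir).glob("*.md")
                  if tag in frontmatter(p).get("tags", []))
```

This is also why tagging frontmatter early pays off: the same fields serve Dataview, scripts, and the LLM's own index scans.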
Implementations in comments: Palinode (git-blame facts, JSON ops: KEEP/UPDATE); knowledge-engine (Memvid for fast machine search, synced to MD); Clawhub skill for conversational builds.
Why Maintenance-Free Knowledge Wins
Bookkeeping kills human wikis: cross-refs lag, contradictions fester, consistency crumbles. LLMs update 15 files without flinching, at near-zero cost.
"The tedious part of maintaining a knowledge base is not the reading or the thinking—it's the bookkeeping. ... Humans abandon wikis because the maintenance burden grows faster than the value. LLMs don't get bored."
Echoes Bush's Memex: private, associative trails. The LLM solves the upkeep problem Bush couldn't. Add domain tags to frontmatter early so cross-project graphs can share entities.
Abstract by design—paste to your agent (Claude/Codex), collaborate on instantiation. No fixed dir/schema; adapt to needs (text-only? Skip images).
Key Takeaways
- Copy-paste this gist to your LLM agent; co-build schema/wiki for your domain (start with CLAUDE.md defining ingest/query/lint).
- Ingest one source at a time: Read LLM summary, guide updates—ripples build depth fast.
- Always output query results to wiki pages + direct answer; compounds explorations.
- Use index.md for navigation; grep log.md for timeline—scales without RAG infra.
- Lint weekly: Fix orphans/contradictions; human-spotcheck citations in high-stakes use.
- Obsidian setup: Enable attachment folder/hotkey; graph view reveals hubs/orphans.
- Classify sources by type pre-extract; use entity-specific templates (7 max).
- Git the wiki: Free versioning; add provenance (hashes/ops) for fact-tracking.