LLM-Powered Persistent Wikis Beat RAG
An LLM builds and maintains a structured markdown wiki from raw sources, creating a compounding knowledge base of cross-references and syntheses that evolves incrementally, unlike RAG's per-query rediscovery.
Shift from Ephemeral RAG to Compounding Wiki Knowledge
Traditional RAG systems, such as NotebookLM or ChatGPT file uploads, retrieve chunks from raw documents at query time, forcing the LLM to rediscover and synthesize knowledge repeatedly. Nothing accumulates: a subtle query spanning multiple documents means piecing fragments together anew each time. The LLM-wiki pattern flips this: the LLM constructs a persistent, interlinked collection of markdown files that acts as a living synthesis. Each new source triggers updates to entity pages, topic summaries, and cross-references, flags contradictions, and strengthens connections. The result is a richer artifact where knowledge compounds: pre-built links and resolved tensions mean queries draw from an already-integrated base.
"Instead of just retrieving from raw documents at query time, the LLM incrementally builds and maintains a persistent wiki — a structured, interlinked collection of markdown files that sits between you and the raw sources."
This applies across domains: personal tracking (goals and health from journals and articles), research (papers building an evolving thesis), book reading (character and theme pages like fan wikis), business (Slack and meetings distilled into a team wiki), or hobbies (trip planning, competitive analysis). The human sources and steers; the LLM handles synthesis.
Three-Layer Stack: Sources, Wiki, Schema
Raw sources form the immutable base: articles, papers, images, data. The LLM reads them but never alters them.
The wiki layer is LLM-owned: markdown files for summaries, entities (e.g., people/concepts), comparisons, overviews. It evolves with each ingest, maintaining consistency via updates across 10-15 pages per source.
The schema (e.g., CLAUDE.md or AGENTS.md) is the configuration layer: it defines the structure, conventions (page formats, linking), and workflows. Co-evolve it with the LLM for your domain, e.g., entity pages with sections for attributes, relations, and sources.
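A minimal sketch of such a schema file, assuming a two-directory layout (all names and headings here are illustrative, not prescribed):

```markdown
# Wiki Schema

## Layout
- raw/       immutable sources; the LLM reads but never edits these
- wiki/      LLM-owned pages (sources, entities, comparisons, overviews)
- index.md   catalog of all pages by category
- log.md     append-only history of ingests and changes

## Entity pages
Sections: Attributes, Relations, Sources. Link with [[wikilinks]].

## Ingest workflow
Summarize the new source, update affected pages and index.md,
flag contradictions, append an entry to log.md.
```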
Use Obsidian as the viewer: the LLM edits files from the chat session while you browse the graph and links in real time. The wiki is the codebase, the LLM the programmer, Obsidian the IDE.
Ingest, Query, and Lint Workflows
Ingest: add the source to the raw directory and prompt the LLM. It extracts takeaways (discussing them with you), creates a summary page, updates the index and entity pages, and logs an entry. Stay involved for guidance or batch for speed; one source can ripple across many pages.
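A hypothetical ingest prompt, assuming the layout sketched above:

```markdown
I added raw/2026-04-02-new-article.md. Ingest it:
1. Propose the key takeaways and pause so we can discuss emphasis.
2. Write a summary page under wiki/sources/.
3. Update index.md and any affected entity pages; flag contradictions
   with existing claims rather than silently overwriting them.
4. Append an ingest entry to log.md.
```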
Query: the LLM scans the index for relevant pages and synthesizes an answer with citations. Outputs vary: markdown, tables, Marp slides, charts. Crucially, file answers back as new wiki pages (e.g., comparisons, analyses) so explorations compound.
"Good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered — these are valuable and shouldn't disappear into chat history."
Lint: periodically check for contradictions, stale info, orphaned pages, and gaps. The LLM suggests new sources and questions and proposes fixes, keeping the wiki healthy as it scales.
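A lint pass can be a recurring prompt along these lines (wording illustrative):

```markdown
Lint the wiki:
- List pairs of pages that contradict each other, with the claims.
- Flag summaries that are stale relative to their sources.
- Find orphan pages unreachable from index.md.
- Suggest questions the wiki cannot yet answer, and sources to fill the gaps.
```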
Navigation: Index for Content, Log for History
index.md catalogs pages by category (entities/concepts/sources) with links, summaries, and metadata (date, source count). The LLM reads it first for queries; it suffices for roughly 100 sources and a few hundred pages, sidestepping embedding-based RAG.
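A sketch of index.md under those conventions (entries are illustrative):

```markdown
# Index

## Entities
- [[jane-doe]]: researcher; 4 sources; updated 2026-04-02
- [[project-x]]: initiative; 2 sources; updated 2026-03-28

## Concepts
- [[retrieval-vs-synthesis]]: running comparison; updated 2026-04-01

## Sources
- [[2026-04-02-new-article]]: summary of raw/2026-04-02-new-article.md
```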
log.md is an append-only chronology of entries like "## 2026-04-02 ingest | Article Title". Grep it for recent activity (e.g., tail the last 5 entries). It tracks the wiki's evolution.
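A few illustrative entries, following the same format (entry types other than ingest are an assumption):

```markdown
## 2026-04-02 ingest | New Article Title
Created wiki/sources/2026-04-02-new-article.md; updated [[jane-doe]] and index.md.

## 2026-04-01 query | Compare Source A and Source B on X
Filed the answer as wiki/comparisons/source-a-vs-source-b.md.

## 2026-03-28 lint
Resolved one contradiction; no orphans found.
```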
Scale with CLI tools: qmd for hybrid search (BM25/vector plus LLM rerank) and a CLI or server for LLM calls. Use Git for versioning and branching.
Implementation Tips and LLM Strengths
Clip sources with Obsidian Web Clipper; download images to raw/assets/ for LLM vision (read the text first, then the images). Graph view reveals structure; Dataview queries frontmatter (tags, dates, sources). Marp renders slides.
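Dataview needs structured frontmatter to query; a minimal example on an entity page (field names are assumptions):

```markdown
---
tags: [entity, person]
created: 2026-04-02
source_count: 3
---
```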
"The tedious part of maintaining a knowledge base is not the reading or the thinking — it's the bookkeeping. Updating cross-references, keeping summaries current, noting when new data contradicts old claims, maintaining consistency across dozens of pages."
LLMs thrive here: tireless maintenance across files, with zero boredom cost. The pattern echoes the Memex's curated trails, but the LLM solves the upkeep problem. It is abstract by design: paste it into an LLM (Claude, Codex) to customize the schema, directories, and pages for your needs.
"The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else."
Key Takeaways
- Curate raw sources (articles/papers); never let the LLM modify them, since immutability keeps them the ground truth.
- Start with schema defining wiki structure/workflows; iterate via LLM collaboration.
- Ingest one-by-one interactively: review summaries/updates, guide emphasis.
- Always file query outputs back as wiki pages to compound value.
- Use index.md for navigation; grep log.md for history.
- Lint regularly: fix contradictions/orphans, pursue LLM-suggested gaps.
- Pair with Obsidian: clip sources, graph view, plugins (Dataview/Marp).
- Git the wiki for versioning; add qmd for search at scale.
- Focus human effort on sourcing/steering; offload all bookkeeping to LLM.