Claude Code + LightRAG: Graph RAG for 500-2000+ Pages
LightRAG, set up through Claude Code, builds cost-effective Graph RAG systems that handle thousands of document pages more cheaply and faster than stuffing everything into an LLM context, using extracted entities and relationships to answer deeper queries.
Graph RAG Extracts Entities and Relationships for Deeper Insights
Naive RAG chunks documents into vectors via embedding models (e.g., OpenAI text-embedding-3-large), stores them in a vector DB, and retrieves the closest matches to a query by cosine similarity. That works for small sets but breaks down on questions that span relationships across documents. Graph RAG improves on this by building a knowledge graph in parallel: entities (e.g., "Anthropic", "Claude Code") become nodes, and relationships (e.g., "Anthropic created Claude Code") become edges carrying descriptive text. Even 10 documents yield an interconnected graph that can be traversed for relationship queries, and the approach scales to 500-1000+ documents for enterprises. LightRAG competes with Microsoft GraphRAG at a fraction of the cost, enabling queries that connect disparate ideas (e.g., cost analysis across AI/RAG docs) with cited sources, entity types (organization/person), and chunk/file references.
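The node-and-edge idea above can be sketched in a few lines: extracted (subject, relation, object) triples become a graph whose edges keep their descriptive text. The triples below are illustrative examples from this article, not LightRAG's actual extraction output.

```python
# Minimal sketch of the Graph RAG data structure: triples -> adjacency list.
from collections import defaultdict

triples = [
    ("Anthropic", "created", "Claude Code"),
    ("Claude Code", "automates setup of", "LightRAG"),
    ("LightRAG", "competes with", "Microsoft GraphRAG"),
]

# node -> list of (relation text, neighbor) edges, traversable in both directions
graph = defaultdict(list)
for subj, rel, obj in triples:
    graph[subj].append((rel, obj))
    graph[obj].append((f"inverse of: {rel}", subj))

def neighbors(entity: str) -> list[tuple[str, str]]:
    """Return the relationships attached to one entity node."""
    return graph[entity]

print(neighbors("Claude Code"))
# -> [('inverse of: created', 'Anthropic'), ('automates setup of', 'LightRAG')]
```

A real system stores embeddings on nodes and edges too, so retrieval can mix vector similarity with multi-hop traversal; this sketch shows only the graph side.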
One-Prompt Claude Code Setup with Docker and OpenAI
Clone the LightRAG repo in Claude Code using this prompt: "Clone the LightRAG repo. Write the .env file configured for OpenAI with GPT-4o-mini and text-embedding-3-large. Use all default local storage and start it with Docker Compose." Requires Docker Desktop running and an OpenAI API key. Claude Code automates everything: installs dependencies, configures the .env, launches the Docker container (visible in Docker Desktop), and returns a localhost:9621 UI link. The UI supports PDF/text uploads (drag-and-drop; the graph is built during embedding and may take time, so reset via the top-left button if it stalls). Go fully local with Ollama for embeddings/QA, or cloud-scale with Postgres/Neon. The free school community provides the exact prompt and skills.
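For reference, a .env along these lines matches the GPT-4o-mini plus text-embedding-3-large setup the prompt asks for. The variable names follow LightRAG's env.example but may differ across versions, so check the repo's own copy before relying on them.

```shell
# Assumed .env for the OpenAI setup described above (verify against env.example)
LLM_BINDING=openai
LLM_MODEL=gpt-4o-mini
LLM_BINDING_API_KEY=<your OpenAI API key>
EMBEDDING_BINDING=openai
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIM=3072
# No storage variables set: LightRAG falls back to its default local storage.
```
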
API Skills Turn LightRAG into Claude Code Commands
Bypass the UI with four key API skills (query, upload, explore, status) for programmatic control: invoke the "LightRAG query skill" in Claude Code (e.g., "What's the full cost picture of running RAG in 2026?") to POST to the localhost API and get JSON responses with summaries, raw output, and references. Upload adds documents without duplicates (check status first); explore inspects entities and relations. Claude Code summarizes verbose responses automatically. This handles 500-2000 text pages (approaching 1M tokens), where agentic search (Claude's file search) hits its limits; RAG is faster and cheaper at scale.
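Under the hood, the query skill is an HTTP POST to the local server. The sketch below assumes a /query endpoint on port 9621 taking a JSON body with "query" and "mode" fields, which matches common LightRAG server defaults; confirm the exact path and payload against your running instance's API docs.

```python
# Hedged sketch of calling a LightRAG server's query endpoint from Python.
import json
import urllib.request

def build_query_payload(question: str, mode: str = "hybrid") -> dict:
    """Assemble the JSON body the /query endpoint is assumed to expect."""
    return {"query": question, "mode": mode}

def query_lightrag(question: str, base_url: str = "http://localhost:9621") -> dict:
    """POST the question to the local LightRAG server and return parsed JSON."""
    payload = json.dumps(build_query_payload(question)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Safe to run without a server: just shows the request body that would be sent.
print(build_query_payload("What's the full cost picture of running RAG in 2026?"))
```

Calling `query_lightrag(...)` requires the Docker container from the setup section to be running; the response JSON is what Claude Code then summarizes for you.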
Use at 500-2000 Pages: ~1,250x Cheaper Than Pure LLM
Switch to Graph RAG at 500-2000 pages: beyond this point, pure LLM contexts/agents cost roughly 1,250x more and respond more slowly (per a July 2024 study comparing textual RAG against Gemini 2.0 long-context use). LightRAG's embedding step is the bottleneck but low-cost, and since setup takes minutes, experimenting is cheap. For non-text content (tables/images), layer RagAnything (from the same makers) on top; that multimodal extension is covered in a follow-up.
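The cost gap is easy to reproduce as back-of-envelope arithmetic: a full-context approach re-sends the whole corpus on every query, while RAG sends only a few retrieved chunks. The prices and token counts below are illustrative assumptions, not current quotes, so the exact multiplier will differ from the study's 1,250x.

```python
# Back-of-envelope per-query cost: full-corpus context vs. RAG retrieval.
LLM_PRICE_PER_M = 0.15        # assumed USD per 1M input tokens (small model)
CORPUS_TOKENS = 1_000_000     # ~2000 pages of text, per the article
RETRIEVED_TOKENS = 4_000      # assumed: a handful of chunks per RAG query

full_context_cost = CORPUS_TOKENS / 1e6 * LLM_PRICE_PER_M
rag_query_cost = RETRIEVED_TOKENS / 1e6 * LLM_PRICE_PER_M
ratio = full_context_cost / rag_query_cost

print(f"full-context: ${full_context_cost:.4f}/query, "
      f"RAG: ${rag_query_cost:.6f}/query, ratio: {ratio:.0f}x")
```

The ratio is driven almost entirely by corpus size over retrieved-chunk size; the one-time embedding cost amortizes across all subsequent queries, which is why it stays the bottleneck in time rather than money.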