Self-Host Vane + Ollama for Private AI Web Research

Install Vane in Docker on Windows 11 with a local Ollama instance running Qwen3.5:9b to run citation-backed searches privately, avoiding cloud services like ChatGPT.

Why Vane Beats Cloud AI Search Tools

Vane, the privacy-focused successor to Perplexica, enables fully local online research by combining SearxNG for web search with a local LLM that summarizes results and generates answers. Every claim carries a source citation, so you can verify answers instead of trusting the model blindly. Because queries never leave your machine, nothing is sent to cloud services like ChatGPT or Perplexity. Vane itself does not need a GPU; only the LLM benefits from one for fast inference.
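The three-service layout described above can be sketched as a Docker Compose file. This is a minimal sketch, not Vane's official configuration: the Vane image name, its port, and the environment variable names are assumptions; the SearxNG image and the Ollama default port (11434) are real.

```yaml
# Hypothetical docker-compose.yml sketch.
# Assumptions: the vane/vane image name, port 3000, and the
# SEARXNG_URL / OLLAMA_URL variable names are illustrative only.
services:
  searxng:
    image: searxng/searxng:latest      # official SearxNG image
    ports:
      - "8080:8080"
  vane:
    image: vane/vane:latest            # placeholder image name
    ports:
      - "3000:3000"
    environment:
      - SEARXNG_URL=http://searxng:8080               # assumed variable name
      - OLLAMA_URL=http://host.docker.internal:11434  # Ollama's default port
    depends_on:
      - searxng
```

On Docker Desktop for Windows, `host.docker.internal` lets the Vane container reach an Ollama server running natively on the host, which keeps the LLM close to the GPU while the web services stay containerized.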

Hardware and Model Selection for Windows 11

On Windows 11 with Docker Desktop, pair Vane with Ollama running Qwen3.5:9b, which fits comfortably on an NVIDIA RTX A4500 (20 GB VRAM) even with large context windows. On GPUs with less memory, switch to smaller variants such as qwen3.5:4b or qwen3.5:2b to keep inference responsive without falling back to the cloud. This local stack delivers practical research performance without the latency or privacy risks of external APIs.
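A short sketch of the model setup, assuming the model tags named above are available in the Ollama library (verify the tag before pulling; `ollama pull`, `ollama run`, and the `nvidia-smi` query flags are real commands):

```shell
# Check VRAM headroom on the host before choosing a model size.
nvidia-smi --query-gpu=memory.total,memory.used --format=csv

# Pull the model; swap in a smaller tag (e.g. qwen3.5:4b) if VRAM is tight.
ollama pull qwen3.5:9b

# Quick smoke test that the model loads and answers locally.
ollama run qwen3.5:9b "Reply with the single word OK."
```

If the smoke test responds, Ollama is serving on its default port and Vane can be pointed at it.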

Setup Outcomes and Trade-offs

Self-hosting Vane provides verifiable, private AI research: SearxNG fetches results privately, and the LLM turns them into cited responses. You gain full control and avoid vendor lock-in, but the setup requires Docker familiarity and a GPU capable of running the chosen LLM. Smaller models trade context depth for broader hardware compatibility, keeping the stack accessible on most developer machines.

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge