Self-Host Vane + Ollama for Private AI Web Research

Install Vane in Docker on Windows 11 with a local Ollama instance running Qwen3.5:9b to run citation-backed searches privately, avoiding cloud services like ChatGPT.

Why Vane Beats Cloud AI Search Tools

Vane, the privacy-focused successor to Perplexica, enables fully local online research by combining SearxNG for web search with a local LLM that summarizes results and generates answers. Every claim carries a source citation, so you can verify answers instead of trusting the model blindly. Because queries never leave your machine, nothing is sent to cloud services like ChatGPT or Perplexity. Vane itself does not need a GPU; only the LLM benefits from one for fast inference.
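The three-service layout described above can be sketched as a Docker Compose file. This is a minimal sketch, not Vane's official configuration: the Vane image name, its port, and the environment variable names are assumptions; the SearxNG image and the Ollama default port (11434) are real.

```yaml
# Hypothetical docker-compose.yml sketch.
# Assumptions: the vane/vane image name, port 3000, and the
# SEARXNG_URL / OLLAMA_URL variable names are illustrative only.
services:
  searxng:
    image: searxng/searxng:latest      # official SearxNG image
    ports:
      - "8080:8080"
  vane:
    image: vane/vane:latest            # placeholder image name
    ports:
      - "3000:3000"
    environment:
      - SEARXNG_URL=http://searxng:8080               # assumed variable name
      - OLLAMA_URL=http://host.docker.internal:11434  # Ollama's default port
    depends_on:
      - searxng
```

On Docker Desktop for Windows, `host.docker.internal` lets the Vane container reach an Ollama server running natively on the host, which keeps the LLM close to the GPU while the web services stay containerized.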

Hardware and Model Selection for Windows 11

On Windows 11 with Docker Desktop, pair Vane with Ollama running Qwen3.5:9b, which fits comfortably on an NVIDIA RTX A4500 (20 GB VRAM) even with large context windows. On GPUs with less memory, switch to smaller variants such as qwen3.5:4b or qwen3.5:2b to keep inference responsive without falling back to the cloud. This local stack delivers practical research performance without the latency or privacy risks of external APIs.
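A short sketch of the model setup, assuming the model tags named above are available in the Ollama library (verify the tag before pulling; `ollama pull`, `ollama run`, and the `nvidia-smi` query flags are real commands):

```shell
# Check VRAM headroom on the host before choosing a model size.
nvidia-smi --query-gpu=memory.total,memory.used --format=csv

# Pull the model; swap in a smaller tag (e.g. qwen3.5:4b) if VRAM is tight.
ollama pull qwen3.5:9b

# Quick smoke test that the model loads and answers locally.
ollama run qwen3.5:9b "Reply with the single word OK."
```

If the smoke test responds, Ollama is serving on its default port and Vane can be pointed at it.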

Setup Outcomes and Trade-offs

Self-hosting Vane provides verifiable, private AI research: SearxNG fetches results privately, and the LLM turns them into cited responses. You gain full control and avoid vendor lock-in, but the setup requires Docker familiarity and a GPU capable of running the chosen LLM. Smaller models trade context depth for broader hardware compatibility, keeping the stack accessible on most developer machines.

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge