LiteLLM Unifies 70+ LLM Providers via OpenAI API

LiteLLM routes OpenAI-compatible requests to 70+ providers such as OpenAI, Anthropic, Groq, and Ollama without code changes, and supports adding custom providers via a JSON edit or pull request.

Call Any LLM Provider with One OpenAI-Compatible Interface

LiteLLM standardizes API calls across 70+ providers using OpenAI's chat completions format, so you can swap models or providers just by changing the model prefix (e.g., openai/gpt-4o, anthropic/claude-3-5-sonnet, groq/llama3-70b-8192). This eliminates per-provider SDKs: use litellm.completion() or the async litellm.acompletion() for text, vision, embeddings, and image generation/editing. Supports all Anthropic models, AWS SageMaker JumpStart, Bedrock (13 models), Fireworks AI (all models), Together AI (all), Ollama (all), OpenRouter (text/chat/vision/embeddings), Replicate (all), and more. For production, deploy LiteLLM Proxy as an LLM gateway with litellm --config to handle routing, auth, and load balancing.
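The prefix-based routing above can be sketched in plain Python. This is a toy illustration of the concept only, not LiteLLM's actual internals; the real call is simply litellm.completion(model="anthropic/claude-3-5-sonnet", messages=[...]).

```python
# Toy sketch of provider-prefix routing: "provider/model" strings are split
# and the provider part selects a backend, the way litellm.completion()
# dispatches on the model prefix. Not LiteLLM's real implementation.

def route(model: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model); bare names default to OpenAI."""
    if "/" in model:
        provider, _, name = model.partition("/")
        return provider, name
    return "openai", model

# The same request shape works regardless of provider -- only the prefix changes.
request = {"messages": [{"role": "user", "content": "hello"}]}
for m in ("openai/gpt-4o", "anthropic/claude-3-5-sonnet", "groq/llama3-70b-8192"):
    provider, name = route(m)
    print(provider, "->", name)
```

Swapping Anthropic for Groq is then a one-string change in the caller, which is the whole point of the unified interface.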

Add Pricing, Context Windows, or New Providers Easily

Update model pricing and context windows by PR to LiteLLM's pricing file; this keeps cost tracking accurate (e.g., for budget alerts). For OpenAI-compatible providers (Hyperbolic, Nscale), add support via a single JSON edit described in the contributing docs. Fully custom providers (non-OpenAI format) use the Custom API Server setup. For quick integration, register as a model provider or use upstream routing for endpoints like Clarifai (Anthropic/OpenAI/Qwen/xAI/Gemini) and DataRobot.
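A pricing entry of this kind can be modeled as a simple lookup keyed on model name. The field names below mirror the style of LiteLLM's pricing JSON (per-token input/output cost plus a context-window limit), but the numbers are fabricated for illustration and the function is a toy, not LiteLLM's cost API.

```python
# Toy cost tracker in the spirit of LiteLLM's pricing file: each model maps
# to per-token costs and a context-window cap. Prices here are made up.
PRICING = {
    "gpt-4o": {
        "input_cost_per_token": 2.5e-6,
        "output_cost_per_token": 1.0e-5,
        "max_input_tokens": 128_000,
    },
}

def completion_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute request cost; a PR to the pricing file adds or updates entries."""
    entry = PRICING[model]
    if input_tokens > entry["max_input_tokens"]:
        raise ValueError("prompt exceeds the model's context window")
    return (input_tokens * entry["input_cost_per_token"]
            + output_tokens * entry["output_cost_per_token"])

print(f"{completion_cost('gpt-4o', 1000, 500):.6f}")  # -> 0.007500
```

Budget alerts then reduce to summing these per-request costs, which is why a stale pricing entry silently skews spend tracking.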

Categories for Specialized Use Cases

Enterprise/Cloud: Azure OpenAI (5), Azure AI (9), Vertex AI (11), Google AI Studio (5), Bedrock (13), OCI, WatsonX (2), Snowflake, SAP GenAI Hub.

High-Performance/Specialized: Groq, Cerebras, Fireworks AI, FriendliAI, DeepInfra, Deepseek, Anyscale, SambaNova, Nebius, Nscale (EU sovereign).

Local/Open-Source: Ollama, vLLM (2), Llamafile, LM Studio, Docker Model Runner, Xinference, Triton Inference Server, NanoGPT, Lemonade (AMD GPUs).

Multimodal/Other: Black Forest Labs (FLUX image gen/editing), Fal AI (Stable Diffusion/Imagen), ElevenLabs (voice), Deepgram (speech-to-text), Stability AI, Recraft, RunwayML (2), Jina AI (embeddings), Milvus (RAG vector store).

Gateways/Agents: OpenRouter, LiteLLM Proxy, Helicone, Vercel AI Gateway, LangGraph/Pydantic AI agents, Manus/RAGFlow. This coverage lets you prototype with local Ollama, scale to Groq, or route via OpenRouter without rewriting code. The trade-off: some providers need specific prefixes or auth (e.g., API keys for Cohere, OAuth for ChatGPT Pro). This is a thin list page; dive into the sub-docs for exact model lists and setup.
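The prototype-locally-then-scale pattern above can be sketched as a tiny fallback router: try providers in priority order and return the first success. This is a toy illustration; LiteLLM Proxy's router and gateways like OpenRouter add retries, cooldowns, and load balancing on top of the same idea.

```python
# Toy fallback router: providers are tried in order (e.g. local Ollama first,
# then Groq) and the first backend that answers wins.
from typing import Callable

def route_with_fallback(providers: list[tuple[str, Callable[[str], str]]],
                        prompt: str) -> tuple[str, str]:
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real routers distinguish retryable errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))

# Stub backends standing in for real litellm.completion() calls.
def ollama_down(prompt: str) -> str:
    raise ConnectionError("local server not running")

def groq_ok(prompt: str) -> str:
    return f"groq says: {prompt}"

name, reply = route_with_fallback([("ollama", ollama_down), ("groq", groq_ok)], "hi")
print(name, reply)  # -> groq groq says: hi
```

Because every backend speaks the same OpenAI-shaped interface, the fallback list is just an ordered preference, not a per-provider integration.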

Summarized by x-ai/grok-4.1-fast via openrouter


© 2026 Edge