LLMs are accessed via APIs from providers like OpenAI, Google (Gemini), Groq, and others. Each has different models, pricing, rate limits, and capabilities.
This chapter introduces the llm_cascade package — a simple Python library that auto-detects your API keys and falls back to the next provider when one hits its quota. Every subsequent chapter in this book uses llm_cascade so you never have to worry about provider-specific code.
8 supported providers: OpenAI, Gemini, Ollama, Grok (xAI), Groq, HuggingFace, Cohere, OpenRouter
One line of code: llm = get_cascade(), and that's it
Automatic fallback: if Gemini returns 429, the next call goes to Groq (or whichever is next)
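To make the two ideas above concrete (key auto-detection and fallback on a 429), here is a minimal, self-contained sketch of the pattern. It is not the llm_cascade source code: the names ENV_KEYS, detect_providers, QuotaExceeded, and Cascade are hypothetical, the environment variable names are assumptions about what such a library might check, and the two provider functions are stubs that make no network calls.

```python
import os
from typing import Callable, List, Mapping, Sequence, Tuple


class QuotaExceeded(Exception):
    """Hypothetical error a provider wrapper raises on HTTP 429."""


# Assumed environment variable names; the real package may check others.
ENV_KEYS = {
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "groq": "GROQ_API_KEY",
}


def detect_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the providers whose API key is set and non-empty, in ENV_KEYS order."""
    return [name for name, var in ENV_KEYS.items() if env.get(var)]


class Cascade:
    """Try providers in order; move to the next one when the current hits its quota."""

    def __init__(self, providers: Sequence[Tuple[str, Callable[[str], str]]]):
        self.providers = list(providers)
        self.current = 0  # index of the provider currently in use

    def ask(self, prompt: str) -> str:
        while self.current < len(self.providers):
            _name, call = self.providers[self.current]
            try:
                return call(prompt)
            except QuotaExceeded:
                # Quota exhausted: skip this provider for this and later calls.
                self.current += 1
        raise RuntimeError("all providers are out of quota")


# Stub providers standing in for real API clients.
def gemini_stub(prompt: str) -> str:
    raise QuotaExceeded("429 Too Many Requests")


def groq_stub(prompt: str) -> str:
    return f"groq: {prompt}"


cascade = Cascade([("gemini", gemini_stub), ("groq", groq_stub)])
answer = cascade.ask("hello")  # gemini_stub raises, so groq_stub answers
```

One design choice in this sketch is an assumption rather than documented behavior: the failed request is retried immediately on the next provider, and the cascade then stays on that provider for later calls. A real implementation might instead return the error once, or switch back after the quota window resets.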