This notebook extends the base autonomous agent workflow by assigning a different LLM vendor to each phase of the reasoning process. Instead of routing all calls through a single shared cascade, each cognitive phase -- planning, execution, synthesis, and critique -- gets its own dedicated LLM provider.
The multi-vendor approach demonstrates an important principle: different LLMs have different strengths. A model that excels at strategic decomposition (planning) may not be the best at fast factual retrieval (execution) or nuanced summarization (synthesis). By pinning each phase to a specific vendor, you can optimize the overall workflow quality while also reducing dependence on any single provider’s availability or rate limits.
Default phase-to-vendor assignments¶
| Phase | Role | Default vendor | Rationale |
|---|---|---|---|
| Plan | Strategic planner | Gemini | Strong at structured decomposition |
| Execute | Business analyst | Groq | Fast, factual, low latency |
| Synthesize | Senior executive | Cohere | Strong summarisation / reasoning |
| Reflect | Skeptical critic | HuggingFace | Independent perspective |
You’re free to change these¶
Each phase is wired in the Imports and Setup cell below with a single line: pin('VendorName'). Three things you can do:
1. Swap any vendor for another. Change pin('Gemini') to pin('OpenAI'), pin('Groq'), or any other provider. The available vendor names print at startup so you can see what’s configured.
2. Use the same vendor for multiple (or all) phases. Repeating vendors is fine -- e.g. pin every phase to the one provider whose API key you have set. The four-phase architecture still works; you’ve just chosen to run all of them through the same backend.
3. Add fallback per phase. Pass multiple names: pin('Gemini', 'Groq') tries Gemini first and falls back to Groq if Gemini is unavailable. This is the production pattern -- a preferred vendor with a reliable backup.
Important: by default, pin('SingleVendor') has no fallback. If that one vendor is rate-limited or unavailable, the whole workflow stops with RuntimeError: All providers exhausted. The fix is to add a backup vendor to the pin() call.
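The difference between a pinned vendor with and without a backup can be sketched without any API keys. The snippet below is a minimal illustration using stub providers; make_provider, generate_with_fallback, and ProviderDown are hypothetical names invented for this sketch and are not part of llm_cascade, whose internals may differ.

```python
class ProviderDown(Exception):
    """Raised by a stub provider to simulate a rate limit or outage."""


def make_provider(name, healthy=True):
    """Build a stub provider: a dict holding a name and a callable (hypothetical)."""
    def call(prompt):
        if not healthy:
            raise ProviderDown(f'{name} unavailable')
        return f'[{name}] answer to: {prompt}'
    return {'name': name, 'call': call}


def generate_with_fallback(providers, prompt):
    """Try each provider in list order; raise RuntimeError if every one fails."""
    for provider in providers:
        try:
            return provider['call'](prompt)
        except ProviderDown:
            continue  # move on to the next provider in the list
    raise RuntimeError('All providers exhausted')


gemini_down = make_provider('Gemini', healthy=False)
groq_up = make_provider('Groq')

# Preferred vendor plus a backup: the call succeeds via the backup.
print(generate_with_fallback([gemini_down, groq_up], 'Is the market large enough?'))

# Single pinned vendor with no backup: the failure stops the run.
try:
    generate_with_fallback([gemini_down], 'Is the market large enough?')
except RuntimeError as err:
    print('RuntimeError:', err)
```

With one name in the list, a single outage ends the run; with two, the second provider quietly takes over.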
If you have not yet seen the single-vendor version (autonomous_agent.ipynb), review that first -- the workflow logic is identical, and this notebook focuses on the multi-vendor routing layer.
```python
# 1. INSTALL DEPENDENCIES
!pip install -q git+https://github.com/KarAnalytics/llm_cascade.git
```

Ignore any errors printed above and proceed: they are specific to Google Colab’s preinstalled packages, and the code will still execute correctly.
Autonomous Agent: Business Idea Validator (Multi-Vendor)¶
This notebook extends the base autonomous workflow by assigning a different free-tier LLM vendor to each phase. Instead of one shared cascade instance, each agent phase has its own LLMCascade instance pinned to a specific provider — demonstrating that autonomous workflows can mix and match models from different vendors within the same run.
Each vendor is loaded from your existing API keys — no extra setup required beyond what llm_cascade already supports.
How is this different from AutonomousAgent_BusinessValidator.ipynb?¶
| | Single-vendor autonomous | Multi-vendor autonomous (this) |
|---|---|---|
| LLM instances | 1 shared cascade | 4 separate per-phase instances |
| Provider per phase | Whichever responds first | Pinned to a specific vendor |
| Fallback | Yes — across all configured keys | Optional — can keep cascade fallback per phase |
| Teaching point | Autonomous workflow basics | Different LLMs can specialise in different cognitive tasks |
Imports and Setup¶
Each phase gets its own LLMCascade instance pinned to a specific provider. The pin() helper accepts one or more provider names; if you pass multiple names the phase still gets cascade-style fallback but only across those named vendors.
Edit the vendor names passed to pin() to swap models — all available names are listed by get_cascade() at startup.
```python
from llm_cascade import get_cascade
from llm_cascade.providers import LLMCascade, PROVIDERS

# Show all configured providers so students know what's available
print('All configured providers:')
_ = get_cascade()


def pin(*provider_names):
    """Return an LLMCascade restricted to the named provider(s).

    Falls back across names in the order given, just like the full cascade.
    If a named provider has no API key it is silently skipped."""
    # Loop over the requested names first so the fallback order follows the
    # argument order rather than the order of entries in PROVIDERS.
    selected = [p for name in provider_names
                for p in PROVIDERS if p['name'] == name]
    if not selected:
        raise ValueError(f'None of {provider_names} found in PROVIDERS. '
                         f'Available: {[p["name"] for p in PROVIDERS]}')
    return LLMCascade(providers=selected, verbose=True)


# --- Assign one vendor per phase ---
# Change any of these to any provider name shown above.
# You can also pass multiple names for per-phase fallback, e.g. pin('Gemini', 'Groq')
planner_llm = pin('Gemini')           # strategic decomposition
executor_llm = pin('Groq')            # fast, factual analyst
synthesizer_llm = pin('Cohere')       # summarisation & recommendation
reflector_llm = pin('HuggingFace')    # independent skeptic

print()
print('Phase → vendor assignments:')
for label, instance in [('Planner', planner_llm), ('Executor', executor_llm),
                        ('Synthesizer', synthesizer_llm), ('Reflector', reflector_llm)]:
    names = ', '.join(p['name'] for p in instance.available) or 'NONE (key missing!)'
    print(f'  {label:<12} -> {names}')
```

The Four Phases of the Autonomous Workflow¶
The logic is identical to the single-vendor version. The only change is that each function calls its own dedicated LLM instance instead of the shared llm object.
1. Plan: planner_llm (Gemini) generates 4–5 research questions
2. Execute: executor_llm (Groq) answers each question, accumulating context
3. Synthesize: synthesizer_llm (Cohere) combines findings into a recommendation
4. Reflect: reflector_llm (HuggingFace) critiques the recommendation and scores it
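"Accumulating context" in the Execute phase means that each answer is appended to a running text block that is passed along with the next question. The sketch below shows only that hand-off, with a stub in place of a real LLM call; stub_answer and the sample questions are illustrative assumptions, not output from any vendor.

```python
def stub_answer(question, prior_context):
    """Stand-in for an LLM call: reports how much context it received."""
    seen = prior_context.count('Q: ')
    return f'(answer written with {seen} earlier Q/A pair(s) in view)'


# Sample questions, standing in for the planner's output.
questions = ['Who is the target customer?',
             'Who are the main competitors?',
             'What would it cost to launch?']

results = []
prior_context = ''
for question in questions:
    answer = stub_answer(question, prior_context)
    results.append((question, answer))
    # Append this Q/A pair so the next question can build on it.
    prior_context += f'\nQ: {question}\nA: {answer}'

for question, answer in results:
    print(question, '->', answer)
```

Because the context is plain text, it does not matter which vendor produced the earlier answers; any model can read them.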
```python
# Phase 1: Planner -- decides WHAT needs to be researched
PLANNER_SYSTEM = (
    'You are a strategic business planner. Given a high-level goal, break it down '
    'into the 4-5 most critical questions that must be answered to achieve it. '
    'Output ONLY a numbered list of questions. No preamble, no conclusion.'
)


def plan(goal):
    '''Phase 1: generate a research plan as a list of questions (uses planner_llm).'''
    prompt = f'Goal: {goal}\n\nList the 4-5 most critical questions to answer.'
    response = planner_llm.generate(prompt, system_prompt=PLANNER_SYSTEM)
    # Parse the numbered list into a Python list of strings
    steps = []
    for line in response.text.split('\n'):
        line = line.strip()
        if not line:
            continue
        # Strip leading '1.', '1)', '-', '*', etc.
        cleaned = line.lstrip('0123456789.)- *').strip()
        if cleaned and len(cleaned) > 5:
            steps.append(cleaned)
    return steps


print('Phase 1 (Planner) ready.')
```

```python
# Phase 2: Executor -- answers ONE research question at a time
EXECUTOR_SYSTEM = (
    'You are a business analyst. Answer the question concisely (2-4 sentences) '
    'with concrete reasoning. Use the prior context if helpful, but stay focused '
    'on the current question.'
)


def execute_step(question, prior_context):
    '''Phase 2: answer one research question with access to prior answers (uses executor_llm).'''
    prompt = 'Prior context:\n' + prior_context + '\n\nQuestion: ' + question
    response = executor_llm.generate(prompt, system_prompt=EXECUTOR_SYSTEM)
    return response.text


print('Phase 2 (Executor) ready.')
```

```python
# Phase 3: Synthesizer -- combines findings into a recommendation
SYNTHESIZER_SYSTEM = (
    'You are a senior executive. Given research findings, synthesize them into a '
    'clear, decisive recommendation. State the recommendation (go / no-go / conditional), '
    'followed by 3 bullet points of key reasoning.'
)


def synthesize(goal, research_results):
    '''Phase 3: combine all research answers into a final recommendation (uses synthesizer_llm).'''
    findings = '\n'.join(f'Q: {q}\nA: {a}' for q, a in research_results)
    prompt = (
        'Goal: ' + goal + '\n\n'
        + 'Research findings:\n' + findings + '\n\n'
        + 'Synthesize into a final recommendation.'
    )
    response = synthesizer_llm.generate(prompt, system_prompt=SYNTHESIZER_SYSTEM)
    return response.text


print('Phase 3 (Synthesizer) ready.')
```

```python
# Phase 4: Critic -- self-reflects on the recommendation
CRITIC_SYSTEM = (
    'You are a skeptical business critic. Given a recommendation, identify '
    'what is weak or missing, what is strong, and assign a confidence level '
    'from 1 (very weak) to 10 (very strong). Be concise.'
)


def reflect(recommendation):
    '''Phase 4: critique the recommendation and assign confidence (uses reflector_llm).'''
    prompt = 'Recommendation to critique:\n' + recommendation
    response = reflector_llm.generate(prompt, system_prompt=CRITIC_SYSTEM)
    return response.text


print('Phase 4 (Critic) ready.')
```

The Autonomous Loop¶
Same logic as the single-vendor notebook, but now each phase label also shows which vendor responded. Watch the [Response from ...] lines to see the four different LLMs handing off to each other.
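Total LLM calls per run: 1 (plan) + 4–5 (execute) + 1 (synthesize) + 1 (reflect) = 7–8 calls. Each call goes to the provider pinned to its phase, so the execution vendor handles 4–5 of them and the other three vendors handle one each.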
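The call count is worth keeping in mind when working with free-tier rate limits. A small sketch of the arithmetic follows; calls_per_run is a helper invented for this illustration and is not part of the notebook's workflow code.

```python
def calls_per_run(num_questions):
    """Return per-phase and total LLM call counts for one workflow run."""
    per_phase = {
        'plan': 1,
        'execute': num_questions,  # one call per research question
        'synthesize': 1,
        'reflect': 1,
    }
    return per_phase, sum(per_phase.values())


for n in (4, 5):
    per_phase, total = calls_per_run(n)
    print(f'{n} questions -> {total} calls: {per_phase}')
```

The execution phase dominates the count, so that is the phase where a backup vendor is most likely to matter.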
```python
def run_autonomous_workflow(goal, verbose=True):
    '''The full autonomous workflow: Plan -> Execute -> Synthesize -> Reflect.
    Each phase uses a different LLM vendor.'''
    if verbose:
        print('=' * 70)
        print('GOAL:', goal)
        print('=' * 70)

    # --- Phase 1: Plan (planner_llm) ---
    if verbose:
        print('\n[PHASE 1 - PLANNING | vendor: ' +
              ', '.join(p['name'] for p in planner_llm.available) + ']')
    steps = plan(goal)
    if verbose:
        print(f'Agent decided to research {len(steps)} questions:')
        for i, s in enumerate(steps, 1):
            print(f'  {i}. {s}')

    # --- Phase 2: Execute each step (executor_llm) ---
    if verbose:
        print('\n[PHASE 2 - EXECUTION | vendor: ' +
              ', '.join(p['name'] for p in executor_llm.available) + ']')
    results = []
    prior_context = ''
    for i, step in enumerate(steps, 1):
        if verbose:
            print(f'\nStep {i}: {step}')
        answer = execute_step(step, prior_context)
        results.append((step, answer))
        prior_context += f'\nQ: {step}\nA: {answer}'
        if verbose:
            print(f'  -> {answer}')

    # --- Phase 3: Synthesize (synthesizer_llm) ---
    if verbose:
        print('\n[PHASE 3 - SYNTHESIS | vendor: ' +
              ', '.join(p['name'] for p in synthesizer_llm.available) + ']')
    recommendation = synthesize(goal, results)
    if verbose:
        print(recommendation)

    # --- Phase 4: Reflect (reflector_llm) ---
    if verbose:
        print('\n[PHASE 4 - SELF-CRITIQUE | vendor: ' +
              ', '.join(p['name'] for p in reflector_llm.available) + ']')
    critique = reflect(recommendation)
    if verbose:
        print(critique)

    return {
        'goal': goal,
        'plan': steps,
        'research': results,
        'recommendation': recommendation,
        'critique': critique,
    }


print('Autonomous multi-vendor workflow function ready.')
```

Run the Agent¶
Edit business_idea and run the cell. Watch the vendor tag on each phase header to confirm that different LLMs are handling different stages of the reasoning chain.
```python
business_idea = 'An AI-powered app that helps college students find affordable off-campus housing by predicting rent trends.'
# business_idea = 'A subscription meal kit service for busy professionals that uses local ingredients.'
# business_idea = 'A mobile game that teaches kids basic accounting through story-based puzzles.'

result = run_autonomous_workflow(business_idea)
```

Inspect the Full Trace¶
The result dict holds everything the agent produced. The questions in result['plan'] came from planner_llm, each answer in result['research'] from executor_llm, the recommendation from synthesizer_llm, and the critique from reflector_llm: four different LLMs contributing to one coherent report.
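The result['research'] entry is a list of (question, answer) tuples. A minimal sketch for rendering those pairs is shown here on a placeholder dict that has the same keys as the workflow's return value; the placeholder strings are not real model output, and format_research is a helper name made up for this example.

```python
# Placeholder dict with the same keys that run_autonomous_workflow returns.
sample_result = {
    'goal': '<business idea>',
    'plan': ['<question 1>', '<question 2>'],
    'research': [('<question 1>', '<answer 1>'),
                 ('<question 2>', '<answer 2>')],
    'recommendation': '<recommendation text>',
    'critique': '<critique text>',
}


def format_research(result):
    """Render the (question, answer) pairs as numbered Q/A text."""
    lines = []
    for i, (question, answer) in enumerate(result['research'], 1):
        lines.append(f'{i}. Q: {question}')
        lines.append(f'   A: {answer}')
    return '\n'.join(lines)


print('RESEARCH (question -> answer):')
print(format_research(sample_result))
```

Passing the real result dict to format_research in place of sample_result prints the executor's answers alongside the questions that prompted them.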
```python
print('=' * 70)
print('FULL TRACE')
print('=' * 70)
print('\nGOAL:', result['goal'])
print('\nPLAN (what the agent decided to research):')
for i, step in enumerate(result['plan'], 1):
    print(f'  {i}. {step}')
print('\nFINAL RECOMMENDATION:')
print(result['recommendation'])
print('\nSELF-CRITIQUE:')
print(result['critique'])
```

Key Takeaways¶
What’s new vs. the single-vendor version:
- Each phase is served by a different LLM from a different vendor; they communicate by passing text, not by sharing state
- The pin() helper lets you assign any provider from PROVIDERS to any phase in one line
- You can pass multiple names to pin() for per-phase fallback: pin('Gemini', 'Groq') tries Gemini first, then Groq, which is useful if one key is missing
- The vendor name appears in each phase header so you can verify the routing in the output

Why use different vendors per phase?
- Some models excel at decomposition (planning), others at fast factual retrieval (execution), others at synthesis or critical evaluation
- It reduces dependence on a single provider’s quota or availability
- It’s the same principle behind multi-agent frameworks like CrewAI or AutoGen, but without the orchestration overhead
Exercises:
1. Swap the vendor assignments and compare the quality of the output. Does a different planner generate better research questions?
2. Pass two vendors to pin() (e.g., pin('Cohere', 'Groq')) and remove the API key for the first one to test per-phase fallback
3. Add a fifth phase (e.g., a marketing writer) using whichever vendor has remaining quota
4. Modify run_autonomous_workflow so that if the critic’s confidence score is below 7, the critique is fed back to planner_llm for a revised plan
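For the last exercise, note that the critic replies in free text, so the score has to be extracted before it can be compared with the threshold. One possible starting point is sketched below, assuming the critic writes something like 'Confidence: 6/10' or '8 out of 10'. The function names and regular expressions are illustrative guesses, not part of llm_cascade, and real critiques may need other patterns.

```python
import re


def extract_confidence(critique):
    """Return the 1-10 confidence score found in the critique, or None."""
    patterns = [
        r'confidence[^0-9]{0,40}(\d{1,2})',   # e.g. 'Confidence: 6/10'
        r'(\d{1,2})\s*(?:/|out of)\s*10\b',   # e.g. '8 out of 10'
    ]
    for pattern in patterns:
        match = re.search(pattern, critique, flags=re.IGNORECASE)
        if match:
            score = int(match.group(1))
            if 1 <= score <= 10:
                return score
    return None


def needs_replan(critique, threshold=7):
    """True only when a score was found and it is below the threshold."""
    score = extract_confidence(critique)
    return score is not None and score < threshold


example = 'Weak on unit economics. Confidence: 6/10'
print(extract_confidence(example))
print(needs_replan(example))
```

Returning None when no score is found leaves the caller to decide whether a missing score should trigger a re-plan or be ignored.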
Run the code¶
To run this notebook, copy the URL below into your browser’s address bar. The link opens the notebook directly in Google Colab. (If your PDF viewer makes the URL clickable and lands on a broken page, copy the full text manually -- the viewer may have truncated the link at a line break.)
Estimated run time: ~3 minutes
https://colab.research.google.com/github/KarAnalytics/code_demos/blob/main/AutonomousAgent_multivendor.ipynb