This notebook demonstrates a multi-agent system where four specialized LLM agents work together to turn an Excel spreadsheet into a queryable database and answer natural-language business questions in two different ways — then compare the results.
The four agents:
| Agent | Role | Output |
|---|---|---|
| Agent 1: Data Architect | Reads Excel sheets and proposes a logical database schema (tables, primary/foreign keys) | Bulleted schema description |
| Agent 2: SQL Developer | Converts the logical schema into SQL DDL, creates the SQLite DB, and populates it from the Excel data | SQL CREATE TABLE statements + populated DB |
| Agent 3: SQL Query Writer | Given a business question, writes a SQL SELECT query and executes it against the database | Structured SQL result rows |
| Agent 4: RAG Analyst | Given the same business question, answers it by reading the raw data directly (no SQL) | Natural-language answer |
| Agent 5: Evaluator | Compares the SQL result and RAG answer, returns a similarity score | Percentage score (0–100%) |
Why two approaches? SQL is precise but requires structured data and correct queries. RAG is flexible but may miss details or hallucinate. Comparing them on the same question shows the tradeoffs clearly.
Workflow:
Phase 1 (one-time setup): Upload an Excel file → Agents 1 and 2 build the database
Phase 2 (repeat as needed): Submit natural-language queries → Agents 3, 4, and 5 run on each query
You can run many queries in Phase 2 without redoing the Phase 1 setup.
# 1. INSTALL DEPENDENCIES
!pip install -q gradio openai pandas chromadb openpyxl langchain-community langchain-openai git+https://github.com/KarAnalytics/llm_cascade.git sentence-transformersThe cell above installs the full toolkit for the multi-agent system. Beyond the basics (pandas, sqlite3), we add chromadb for vector storage (used by the RAG agent), langchain-community and langchain-openai for embedding utilities, sentence-transformers for local embedding models, and llm_cascade for provider-agnostic LLM calls. The multi-agent version requires more libraries than the single-agent notebook because it runs two parallel answer pathways -- SQL and RAG -- plus an evaluator to compare them.
Multi-Agent System: SQL vs. RAG for Business Database Queries¶
Why would you use five agents when one could do the job? The answer lies in a principle borrowed from software engineering: separation of concerns. Each agent in this notebook has a narrow, well-defined role and a system prompt tailored to that role. The Data Architect thinks about table relationships and primary keys. The SQL Developer thinks about data types and CREATE TABLE syntax. The SQL Query Writer thinks about JOINs and WHERE clauses. The RAG Analyst ignores SQL entirely and reads the raw data. And the Evaluator acts as an impartial judge, comparing the two answers without knowing which approach produced which.
This specialization offers two concrete benefits. First, each agent’s system prompt can be short, focused, and easy to debug -- if the SQL is wrong, you know exactly which agent and which prompt to inspect. Second, by running two independent answer pathways (SQL and RAG) and comparing them, the system catches mistakes that either approach alone would miss. When the SQL answer says “total revenue is USD 1.2M” and the RAG answer says “USD 1.2M,” you have high confidence the number is correct. When they disagree, that disagreement itself is valuable information.
The five agents:
| Agent | Role | What it specializes in |
|---|---|---|
| Agent 1: Data Architect | Reads Excel sheets, proposes a logical schema | Understanding data structure, primary/foreign keys |
| Agent 2: SQL Developer | Converts the logical schema into DDL and creates the database | SQL syntax, data types, table creation |
| Agent 3: SQL Query Writer | Writes and executes SQL SELECT queries for business questions | Query optimization, JOINs, aggregations |
| Agent 4: RAG Analyst | Answers the same question by reading raw data (no SQL) | Natural language reasoning over tabular data |
| Agent 5: Evaluator | Compares SQL and RAG answers, returns a similarity score | LLM-as-a-judge pattern, cross-validation |
Workflow:
Phase 1 (one-time setup): Upload an Excel file -> Agents 1 and 2 build the database
Phase 2 (repeat as needed): Submit natural-language queries -> Agents 3, 4, and 5 run on each query
Imports and Setup¶
We use pandas to read Excel, sqlite3 for the database, chromadb (for future vector RAG), and llm_cascade for automatic LLM provider fallback across 8 free-tier providers.
import os
import sqlite3
import pandas as pd
import chromadb
from llm_cascade import get_cascade
llm = get_cascade()
# Global database config
DB_NAME = 'agent_database.db'
CHROMA_CLIENT = chromadb.Client()
LLM Cascade - available providers:
+ Gemini model=gemini-2.5-flash
+ Ollama model=kimi-k2.5:cloud
+ Groq model=llama-3.3-70b-versatile
+ HuggingFace model=meta-llama/Llama-3.3-70B-Instruct
+ Cohere model=command-a-03-2025
+ OpenRouter model=meta-llama/llama-3.3-70b-instruct:free
+ OpenAI model=gpt-4o-mini
Not configured (skipped):
- Grok (xAI) (set XAI_API_KEY)
Agent 1: Data Architect¶
Role: Read each Excel sheet, look at the column names and data types, and propose a logical database schema in plain English.
How it works:
Reads each sheet with pandas
Extracts column names and dtypes
Sends this metadata to the LLM with a prompt asking for a bulleted schema description (primary keys, foreign keys, relationships)
The LLM returns a human-readable schema — no SQL yet
The logical schema serves as the shared plan that Agent 2 (SQL Developer) will implement.
def agent_1_ingest(file_path):
'''Agent 1 -- Data Architect: reads Excel, proposes logical schema.'''
xl = pd.ExcelFile(file_path)
sheet_data = ''
data_map = {}
for sheet in xl.sheet_names:
df = pd.read_excel(file_path, sheet_name=sheet)
data_map[sheet] = df
cols = ', '.join([f'{col} ({dtype})' for col, dtype in zip(df.columns, df.dtypes)])
sheet_data += f'Sheet Name: {sheet}' + chr(10) + f'Columns detected: {cols}' + chr(10) + chr(10)
prompt = (
'Analyze these sheets and provide a logical database schema. '
'Use ONLY bullet points. Identify suggested Primary Keys and Foreign Keys:' + chr(10) + sheet_data
)
system_prompt = 'You are a Data Architect. Output a logical schema in bulleted text. Do NOT write SQL code.'
response = llm.generate(prompt, system_prompt=system_prompt)
return response.text, data_map
print('Agent 1 (Data Architect) ready.')
Agent 1 (Data Architect) ready.
Agent 2: SQL Developer¶
Role: Take the logical schema from Agent 1 and turn it into an actual SQLite database populated with the Excel data.
How it works:
Prompts the LLM to generate SQL
CREATE TABLEDDL statements from the logical schemaExecutes the DDL against a fresh SQLite database
Uses pandas
to_sql()to populate each table from the corresponding Excel sheetReturns the DDL text and a status message
After this agent runs, we have a queryable SQLite database ready for business questions.
def agent_2_sql_developer(logical_schema, data_map):
'''Agent 2 -- SQL Developer: generates DDL, creates DB, populates tables.'''
prompt = f'Based on this logical schema, write SQLite CREATE TABLE statements:' + chr(10) + logical_schema
system_prompt = 'Return ONLY raw SQL CREATE TABLE statements. No explanation.'
response = llm.generate(prompt, system_prompt=system_prompt)
ddl_resp = response.text.replace('```sql', '').replace('```', '')
# Execute DDL and insert data
if os.path.exists(DB_NAME):
os.remove(DB_NAME)
conn = sqlite3.connect(DB_NAME)
cursor = conn.cursor()
try:
cursor.executescript(ddl_resp)
for table, df in data_map.items():
clean_table_name = table.replace(' ', '_')
df.to_sql(clean_table_name, conn, if_exists='replace', index=False)
conn.commit()
status = 'Database Ready: Tables created and data inserted.'
except Exception as e:
status = f'Database Error: {str(e)}'
finally:
conn.close()
return ddl_resp, status
print('Agent 2 (SQL Developer) ready.')
Agent 2 (SQL Developer) ready.
Agent 3: SQL Query Writer¶
Role: Given a natural-language business question, write a SQL SELECT query and execute it against the database.
How it works:
Sends the schema + the user’s question to the LLM
Asks the LLM to return ONLY the SQL query (no explanation)
Strips any markdown code fences
Runs the query with pandas and returns the result as a formatted string
Strengths: Precise, verifiable, handles aggregations and joins well. Weaknesses: Fails if the LLM writes bad SQL or misunderstands the schema.
def agent_3_sql_query(question, logical_schema, conn):
'''Agent 3 -- SQL Query Writer: writes and executes a SQL SELECT for the question.'''
sql_gen_prompt = (
'Schema:' + chr(10) + logical_schema + chr(10)
+ 'Query: ' + question + chr(10)
+ 'Generate ONLY the SQL SELECT statement. Table names might have underscores.'
)
response = llm.generate(sql_gen_prompt)
sql_query = response.text.replace('```sql', '').replace('```', '').strip()
try:
result_df = pd.read_sql_query(sql_query, conn)
return sql_query, result_df.to_string()
except Exception as e:
return sql_query, f'SQL Execution Error: {str(e)}'
print('Agent 3 (SQL Query Writer) ready.')
Agent 3 (SQL Query Writer) ready.
Agent 4: RAG Analyst¶
Role: Answer the same business question using retrieval-augmented generation — no SQL, just raw data passed into the LLM prompt as context.
How it works:
Builds a context string from the first 20 rows of each table (as plain text)
Sends that context + the user’s question to the LLM
The LLM reads the raw data and answers in natural language
Strengths: Works without valid SQL, handles fuzzy or open-ended questions. Weaknesses: Limited by context window (only sees first 20 rows here), may hallucinate, less precise for numeric aggregations.
def agent_4_rag_analyst(question, data_map):
'''Agent 4 -- RAG Analyst: answers from raw data passed as context.'''
context = chr(10).join(
['Table ' + k + ':' + chr(10) + v.head(20).to_string() for k, v in data_map.items()]
)
rag_prompt = (
'Data Context:' + chr(10) + context + chr(10)
+ 'User Question: ' + question + chr(10)
+ 'Answer the question based on the data provided.'
)
response = llm.generate(rag_prompt)
return response.text
print('Agent 4 (RAG Analyst) ready.')
Agent 4 (RAG Analyst) ready.
Agent 5: Evaluator¶
Role: Compare the SQL answer (Agent 3) and the RAG answer (Agent 4) for the same question, and return a similarity score.
How it works:
Sends both answers to the LLM
Asks it to return ONLY a percentage score (0–100%)
High score = both approaches agree; low score = they diverge (worth investigating why)
This is a simple “LLM-as-a-judge” pattern — using an LLM to evaluate other LLM outputs.
def agent_5_evaluator(sql_result, rag_result):
'''Agent 5 -- Evaluator: compares SQL and RAG answers, returns a similarity score.'''
eval_prompt = (
'Compare these two answers and provide ONLY a Similarity Score (0-100%).' + chr(10)
+ 'Answer A: ' + sql_result + chr(10)
+ 'Answer B: ' + rag_result
)
system_prompt = 'You are a calculator. Output only a percentage score.'
response = llm.generate(eval_prompt, system_prompt=system_prompt)
return response.text
print('Agent 5 (Evaluator) ready.')
Agent 5 (Evaluator) ready.
Interactive App: Upload Excel and Ask Questions¶
The cell below launches a Gradio app that gives you an interactive UI for the multi-agent system:
Step 1: Upload your Excel file and click Build Database. Agents 1 and 2 run: the Data Architect designs a logical schema, and the SQL Developer writes the DDL and creates a SQLite database.
Step 2: Type a business question and click Submit Query (or press Enter). Agents 3, 4, and 5 run: the SQL Query Writer generates SQL, the RAG Analyst answers from raw data, and the Evaluator compares the two. Each Q&A is added to the chat history.
Repeat Step 2 as many times as you want — no need to re-upload the file.
If you do not have an Excel file on hand, download this sample dataset and upload it: SUPPLIER_PARTS.xlsx (click Download raw file on the GitHub page).
This is the classroom-friendly way to compare SQL vs. RAG live. Students can submit their own questions and watch the agents disagree (or agree) in real time.
import gradio as gr
# Module-level state so the Gradio UI can persist the schema across queries
_app_state = {'logical_schema': None, 'data_map': None, 'db_ready': False}
def ui_build_database(file):
'''Phase 1: run Agents 1 and 2 on the uploaded file.'''
if file is None:
return '', '', 'Please upload an Excel file first.'
try:
logical_schema, data_map = agent_1_ingest(file.name)
ddl, status = agent_2_sql_developer(logical_schema, data_map)
_app_state['logical_schema'] = logical_schema
_app_state['data_map'] = data_map
_app_state['db_ready'] = 'Ready' in status
return logical_schema, ddl, status
except Exception as e:
return '', '', f'Error building database: {e}'
def ui_ask_question(question, history):
'''Phase 2: run Agents 3, 4, 5 on a single question.'''
if not _app_state['db_ready']:
bot_msg = 'Please upload an Excel file and click Build Database first.'
return history + [[question, bot_msg]], ''
if not question or not question.strip():
return history, question
try:
conn = sqlite3.connect(DB_NAME)
try:
sql_query, sql_result = agent_3_sql_query(question, _app_state['logical_schema'], conn)
finally:
conn.close()
rag_result = agent_4_rag_analyst(question, _app_state['data_map'])
similarity = agent_5_evaluator(sql_result, rag_result)
response = (
'**Agent 3 (SQL):**' + chr(10) + '```sql' + chr(10) + sql_query + chr(10) + '```' + chr(10) + chr(10)
+ '**SQL Result:**' + chr(10) + '```' + chr(10) + sql_result + chr(10) + '```' + chr(10) + chr(10)
+ '**Agent 4 (RAG):** ' + rag_result + chr(10) + chr(10)
+ '**Agent 5 (Similarity):** ' + similarity
)
except Exception as e:
response = f'Error: {e}'
return history + [[question, response]], ''
with gr.Blocks(title='Multi-Agent DB') as demo:
gr.Markdown('# Multi-Agent: Excel to SQLite to SQL vs. RAG Comparison')
gr.Markdown('### Step 1: Upload Excel file and build the database (Agents 1 & 2)')
with gr.Row():
file_input = gr.File(label='Excel File', file_types=['.xlsx', '.xls'])
build_btn = gr.Button('Build Database', variant='primary')
with gr.Row():
with gr.Column():
out_schema = gr.Markdown(label='Logical Schema (Agent 1)')
with gr.Column():
out_ddl = gr.Code(label='SQL DDL (Agent 2)', language='sql')
out_status = gr.Textbox(label='Status', interactive=False)
build_btn.click(
ui_build_database,
inputs=[file_input],
outputs=[out_schema, out_ddl, out_status],
)
gr.Markdown('### Step 2: Ask business questions (Agents 3, 4 & 5 run per query)')
chatbot = gr.Chatbot(label='Q&A History', height=400)
query_input = gr.Textbox(
label='Your question',
placeholder='e.g., How many total sales? Who is the top employee?',
)
submit_btn = gr.Button('Submit Query', variant='primary')
submit_btn.click(
ui_ask_question,
inputs=[query_input, chatbot],
outputs=[chatbot, query_input],
)
query_input.submit(
ui_ask_question,
inputs=[query_input, chatbot],
outputs=[chatbot, query_input],
)
demo.launch(debug=False)
/tmp/ipykernel_1601/2530079974.py:77: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style dictionaries with 'role' and 'content' keys.
chatbot = gr.Chatbot(label='Q&A History', height=400)
/tmp/ipykernel_1601/2530079974.py:77: DeprecationWarning: The default value of 'allow_tags' in gr.Chatbot will be changed from False to True in Gradio 6.0. You will need to explicitly set allow_tags=False if you want to disable tags in your chatbot.
chatbot = gr.Chatbot(label='Q&A History', height=400)
It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).
Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a196d879b4ffc4d788.gradio.live
This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
Key Takeaways¶
What we demonstrated:
A multi-agent system where each agent has a narrow, well-defined role
Agents pass outputs to each other (logical schema → DDL → populated DB → queries)
The same business question answered in two ways (SQL vs. RAG) with automatic comparison
Separation of setup phase (expensive, run once) from query phase (cheap, repeat many times)
When to use SQL vs. RAG on structured data:
| SQL (Agent 3) | RAG (Agent 4) | |
|---|---|---|
| Numeric aggregations | Excellent | Often wrong |
| Fuzzy/open-ended questions | Limited | Better |
| Joins across tables | Excellent | Struggles |
| Context size limit | None | Yes |
| Requires valid schema | Yes | No |
| Precision | High | Variable |
In practice, systems often use both: SQL for precise numeric queries, RAG for explanatory or fuzzy questions — and an LLM-as-a-judge (Agent 5) to flag when they disagree.
Exercises:
Add a new agent that generates a chart from the SQL result
Extend Agent 4 to use vector embeddings (ChromaDB) instead of raw context
Add an Agent 6 that explains WHY the two answers disagreed (when similarity is low)
Replace
llm_cascadewith a specific provider and see how different LLMs perform at each agent role
Run the code¶
To run this notebook, copy the URL below into your browser’s address bar. The link opens the notebook directly in Google Colab. (If your PDF viewer makes the URL clickable and lands on a broken page, copy the full text manually -- the viewer may have truncated the link at a line break.)
https://colab.research.google.com/github/KarAnalytics/code_demos/blob/main/MultiAgent_DB_multi_human_input.ipynb