Multi-Agent System: SQL vs. RAG for Business Database Queries

This notebook demonstrates a multi-agent system where four specialized LLM agents work together to turn an Excel spreadsheet into a queryable database and answer natural-language business questions in two different ways — then compare the results.

The four agents:

Agent	Role	Output
Agent 1: Data Architect	Reads Excel sheets and proposes a logical database schema (tables, primary/foreign keys)	Bulleted schema description
Agent 2: SQL Developer	Converts the logical schema into SQL DDL, creates the SQLite DB, and populates it from the Excel data	SQL `CREATE TABLE` statements + populated DB
Agent 3: SQL Query Writer	Given a business question, writes a SQL `SELECT` query and executes it against the database	Structured SQL result rows
Agent 4: RAG Analyst	Given the same business question, answers it by reading the raw data directly (no SQL)	Natural-language answer
Agent 5: Evaluator	Compares the SQL result and RAG answer, returns a similarity score	Percentage score (0–100%)

Why two approaches? SQL is precise but requires structured data and correct queries. RAG is flexible but may miss details or hallucinate. Comparing them on the same question shows the tradeoffs clearly.

Workflow:

Phase 1 (one-time setup): Upload an Excel file → Agents 1 and 2 build the database
Phase 2 (repeat as needed): Submit natural-language queries → Agents 3, 4, and 5 run on each query

You can run many queries in Phase 2 without redoing the Phase 1 setup.

# 1. INSTALL DEPENDENCIES
!pip install -q gradio openai pandas chromadb openpyxl langchain-community langchain-openai git+https://github.com/KarAnalytics/llm_cascade.git sentence-transformers

  Preparing metadata (setup.py) ... done
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 52.0/52.0 kB 2.3 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.2/23.2 MB 86.9 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 114.6 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 98.6/98.6 kB 11.3 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 278.2/278.2 kB 28.8 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 104.0 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 74.1 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 542.4/542.4 kB 49.7 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 97.5 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.1/72.1 kB 7.9 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 180.2/180.2 kB 19.6 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 69.0/69.0 kB 7.5 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 231.6/231.6 kB 25.3 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 71.6/71.6 kB 7.8 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.6/60.6 kB 7.0 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.9/64.9 kB 7.5 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 51.0/51.0 kB 5.6 MB/s eta 0:00:00
  Building wheel for llm_cascade (setup.py) ... done
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.32.4, but you have requests 2.33.1 which is incompatible.
opentelemetry-exporter-otlp-proto-http 1.38.0 requires opentelemetry-exporter-otlp-proto-common==1.38.0, but you have opentelemetry-exporter-otlp-proto-common 1.41.1 which is incompatible.
opentelemetry-exporter-otlp-proto-http 1.38.0 requires opentelemetry-proto==1.38.0, but you have opentelemetry-proto 1.41.1 which is incompatible.
opentelemetry-exporter-otlp-proto-http 1.38.0 requires opentelemetry-sdk~=1.38.0, but you have opentelemetry-sdk 1.41.1 which is incompatible.
opentelemetry-exporter-gcp-logging 1.11.0a0 requires opentelemetry-sdk<1.39.0,>=1.35.0, but you have opentelemetry-sdk 1.41.1 which is incompatible.
google-adk 1.29.0 requires opentelemetry-api<1.39.0,>=1.36.0, but you have opentelemetry-api 1.41.1 which is incompatible.
google-adk 1.29.0 requires opentelemetry-sdk<1.39.0,>=1.36.0, but you have opentelemetry-sdk 1.41.1 which is incompatible.

The cell above installs the full toolkit for the multi-agent system. Beyond the basics (pandas, sqlite3), we add chromadb for vector storage (used by the RAG agent), langchain-community and langchain-openai for embedding utilities, sentence-transformers for local embedding models, and llm_cascade for provider-agnostic LLM calls. The multi-agent version requires more libraries than the single-agent notebook because it runs two parallel answer pathways -- SQL and RAG -- plus an evaluator to compare them.

Multi-Agent System: SQL vs. RAG for Business Database Queries¶

Why would you use five agents when one could do the job? The answer lies in a principle borrowed from software engineering: separation of concerns. Each agent in this notebook has a narrow, well-defined role and a system prompt tailored to that role. The Data Architect thinks about table relationships and primary keys. The SQL Developer thinks about data types and CREATE TABLE syntax. The SQL Query Writer thinks about JOINs and WHERE clauses. The RAG Analyst ignores SQL entirely and reads the raw data. And the Evaluator acts as an impartial judge, comparing the two answers without knowing which approach produced which.

This specialization offers two concrete benefits. First, each agent’s system prompt can be short, focused, and easy to debug -- if the SQL is wrong, you know exactly which agent and which prompt to inspect. Second, by running two independent answer pathways (SQL and RAG) and comparing them, the system catches mistakes that either approach alone would miss. When the SQL answer says “total revenue is USD 1.2M” and the RAG answer says “USD 1.2M,” you have high confidence the number is correct. When they disagree, that disagreement itself is valuable information.

The five agents:

Agent	Role	What it specializes in
Agent 1: Data Architect	Reads Excel sheets, proposes a logical schema	Understanding data structure, primary/foreign keys
Agent 2: SQL Developer	Converts the logical schema into DDL and creates the database	SQL syntax, data types, table creation
Agent 3: SQL Query Writer	Writes and executes SQL SELECT queries for business questions	Query optimization, JOINs, aggregations
Agent 4: RAG Analyst	Answers the same question by reading raw data (no SQL)	Natural language reasoning over tabular data
Agent 5: Evaluator	Compares SQL and RAG answers, returns a similarity score	LLM-as-a-judge pattern, cross-validation

Workflow:

Phase 1 (one-time setup): Upload an Excel file -> Agents 1 and 2 build the database
Phase 2 (repeat as needed): Submit natural-language queries -> Agents 3, 4, and 5 run on each query

Imports and Setup¶

We use pandas to read Excel, sqlite3 for the database, chromadb (for future vector RAG), and llm_cascade for automatic LLM provider fallback across 8 free-tier providers.

import os
import sqlite3
import pandas as pd
import chromadb
from llm_cascade import get_cascade

llm = get_cascade()

# Global database config
DB_NAME = 'agent_database.db'
CHROMA_CLIENT = chromadb.Client()

LLM Cascade - available providers:
  + Gemini           model=gemini-2.5-flash
  + Ollama           model=kimi-k2.5:cloud
  + Groq             model=llama-3.3-70b-versatile
  + HuggingFace      model=meta-llama/Llama-3.3-70B-Instruct
  + Cohere           model=command-a-03-2025
  + OpenRouter       model=meta-llama/llama-3.3-70b-instruct:free
  + OpenAI           model=gpt-4o-mini
Not configured (skipped):
  - Grok (xAI)       (set XAI_API_KEY)

Agent 1: Data Architect¶

Role: Read each Excel sheet, look at the column names and data types, and propose a logical database schema in plain English.

How it works:

Reads each sheet with pandas
Extracts column names and dtypes
Sends this metadata to the LLM with a prompt asking for a bulleted schema description (primary keys, foreign keys, relationships)
The LLM returns a human-readable schema — no SQL yet

The logical schema serves as the shared plan that Agent 2 (SQL Developer) will implement.

def agent_1_ingest(file_path):
    '''Agent 1 -- Data Architect: reads Excel, proposes logical schema.'''
    xl = pd.ExcelFile(file_path)
    sheet_data = ''
    data_map = {}

    for sheet in xl.sheet_names:
        df = pd.read_excel(file_path, sheet_name=sheet)
        data_map[sheet] = df
        cols = ', '.join([f'{col} ({dtype})' for col, dtype in zip(df.columns, df.dtypes)])
        sheet_data += f'Sheet Name: {sheet}' + chr(10) + f'Columns detected: {cols}' + chr(10) + chr(10)

    prompt = (
        'Analyze these sheets and provide a logical database schema. '
        'Use ONLY bullet points. Identify suggested Primary Keys and Foreign Keys:' + chr(10) + sheet_data
    )
    system_prompt = 'You are a Data Architect. Output a logical schema in bulleted text. Do NOT write SQL code.'
    response = llm.generate(prompt, system_prompt=system_prompt)
    return response.text, data_map


print('Agent 1 (Data Architect) ready.')

Agent 1 (Data Architect) ready.

Agent 2: SQL Developer¶

Role: Take the logical schema from Agent 1 and turn it into an actual SQLite database populated with the Excel data.

How it works:

Prompts the LLM to generate SQL CREATE TABLE DDL statements from the logical schema
Executes the DDL against a fresh SQLite database
Uses pandas to_sql() to populate each table from the corresponding Excel sheet
Returns the DDL text and a status message

After this agent runs, we have a queryable SQLite database ready for business questions.

def agent_2_sql_developer(logical_schema, data_map):
    '''Agent 2 -- SQL Developer: generates DDL, creates DB, populates tables.'''
    prompt = f'Based on this logical schema, write SQLite CREATE TABLE statements:' + chr(10) + logical_schema
    system_prompt = 'Return ONLY raw SQL CREATE TABLE statements. No explanation.'
    response = llm.generate(prompt, system_prompt=system_prompt)
    ddl_resp = response.text.replace('```sql', '').replace('```', '')

    # Execute DDL and insert data
    if os.path.exists(DB_NAME):
        os.remove(DB_NAME)
    conn = sqlite3.connect(DB_NAME)
    cursor = conn.cursor()
    try:
        cursor.executescript(ddl_resp)
        for table, df in data_map.items():
            clean_table_name = table.replace(' ', '_')
            df.to_sql(clean_table_name, conn, if_exists='replace', index=False)
        conn.commit()
        status = 'Database Ready: Tables created and data inserted.'
    except Exception as e:
        status = f'Database Error: {str(e)}'
    finally:
        conn.close()

    return ddl_resp, status


print('Agent 2 (SQL Developer) ready.')

Agent 2 (SQL Developer) ready.

Agent 3: SQL Query Writer¶

Role: Given a natural-language business question, write a SQL SELECT query and execute it against the database.

How it works:

Sends the schema + the user’s question to the LLM
Asks the LLM to return ONLY the SQL query (no explanation)
Strips any markdown code fences
Runs the query with pandas and returns the result as a formatted string

Strengths: Precise, verifiable, handles aggregations and joins well. Weaknesses: Fails if the LLM writes bad SQL or misunderstands the schema.

def agent_3_sql_query(question, logical_schema, conn):
    '''Agent 3 -- SQL Query Writer: writes and executes a SQL SELECT for the question.'''
    sql_gen_prompt = (
        'Schema:' + chr(10) + logical_schema + chr(10)
        + 'Query: ' + question + chr(10)
        + 'Generate ONLY the SQL SELECT statement. Table names might have underscores.'
    )
    response = llm.generate(sql_gen_prompt)
    sql_query = response.text.replace('```sql', '').replace('```', '').strip()

    try:
        result_df = pd.read_sql_query(sql_query, conn)
        return sql_query, result_df.to_string()
    except Exception as e:
        return sql_query, f'SQL Execution Error: {str(e)}'


print('Agent 3 (SQL Query Writer) ready.')

Agent 3 (SQL Query Writer) ready.

Agent 4: RAG Analyst¶

Role: Answer the same business question using retrieval-augmented generation — no SQL, just raw data passed into the LLM prompt as context.

How it works:

Builds a context string from the first 20 rows of each table (as plain text)
Sends that context + the user’s question to the LLM
The LLM reads the raw data and answers in natural language

Strengths: Works without valid SQL, handles fuzzy or open-ended questions. Weaknesses: Limited by context window (only sees first 20 rows here), may hallucinate, less precise for numeric aggregations.

def agent_4_rag_analyst(question, data_map):
    '''Agent 4 -- RAG Analyst: answers from raw data passed as context.'''
    context = chr(10).join(
        ['Table ' + k + ':' + chr(10) + v.head(20).to_string() for k, v in data_map.items()]
    )
    rag_prompt = (
        'Data Context:' + chr(10) + context + chr(10)
        + 'User Question: ' + question + chr(10)
        + 'Answer the question based on the data provided.'
    )
    response = llm.generate(rag_prompt)
    return response.text


print('Agent 4 (RAG Analyst) ready.')

Agent 4 (RAG Analyst) ready.

Agent 5: Evaluator¶

Role: Compare the SQL answer (Agent 3) and the RAG answer (Agent 4) for the same question, and return a similarity score.

How it works:

Sends both answers to the LLM
Asks it to return ONLY a percentage score (0–100%)
High score = both approaches agree; low score = they diverge (worth investigating why)

This is a simple “LLM-as-a-judge” pattern — using an LLM to evaluate other LLM outputs.

def agent_5_evaluator(sql_result, rag_result):
    '''Agent 5 -- Evaluator: compares SQL and RAG answers, returns a similarity score.'''
    eval_prompt = (
        'Compare these two answers and provide ONLY a Similarity Score (0-100%).' + chr(10)
        + 'Answer A: ' + sql_result + chr(10)
        + 'Answer B: ' + rag_result
    )
    system_prompt = 'You are a calculator. Output only a percentage score.'
    response = llm.generate(eval_prompt, system_prompt=system_prompt)
    return response.text


print('Agent 5 (Evaluator) ready.')

Agent 5 (Evaluator) ready.

Interactive App: Upload Excel and Ask Questions¶

The cell below launches a Gradio app that gives you an interactive UI for the multi-agent system:

Step 1: Upload your Excel file and click Build Database. Agents 1 and 2 run: the Data Architect designs a logical schema, and the SQL Developer writes the DDL and creates a SQLite database.
Step 2: Type a business question and click Submit Query (or press Enter). Agents 3, 4, and 5 run: the SQL Query Writer generates SQL, the RAG Analyst answers from raw data, and the Evaluator compares the two. Each Q&A is added to the chat history.
Repeat Step 2 as many times as you want — no need to re-upload the file.

If you do not have an Excel file on hand, download this sample dataset and upload it: SUPPLIER_PARTS.xlsx (click Download raw file on the GitHub page).

This is the classroom-friendly way to compare SQL vs. RAG live. Students can submit their own questions and watch the agents disagree (or agree) in real time.

import gradio as gr

# Module-level state so the Gradio UI can persist the schema across queries
_app_state = {'logical_schema': None, 'data_map': None, 'db_ready': False}


def ui_build_database(file):
    '''Phase 1: run Agents 1 and 2 on the uploaded file.'''
    if file is None:
        return '', '', 'Please upload an Excel file first.'

    try:
        logical_schema, data_map = agent_1_ingest(file.name)
        ddl, status = agent_2_sql_developer(logical_schema, data_map)

        _app_state['logical_schema'] = logical_schema
        _app_state['data_map'] = data_map
        _app_state['db_ready'] = 'Ready' in status
        return logical_schema, ddl, status
    except Exception as e:
        return '', '', f'Error building database: {e}'


def ui_ask_question(question, history):
    '''Phase 2: run Agents 3, 4, 5 on a single question.'''
    if not _app_state['db_ready']:
        bot_msg = 'Please upload an Excel file and click Build Database first.'
        return history + [[question, bot_msg]], ''

    if not question or not question.strip():
        return history, question

    try:
        conn = sqlite3.connect(DB_NAME)
        try:
            sql_query, sql_result = agent_3_sql_query(question, _app_state['logical_schema'], conn)
        finally:
            conn.close()

        rag_result = agent_4_rag_analyst(question, _app_state['data_map'])
        similarity = agent_5_evaluator(sql_result, rag_result)

        response = (
            '**Agent 3 (SQL):**' + chr(10) + '```sql' + chr(10) + sql_query + chr(10) + '```' + chr(10) + chr(10)
            + '**SQL Result:**' + chr(10) + '```' + chr(10) + sql_result + chr(10) + '```' + chr(10) + chr(10)
            + '**Agent 4 (RAG):** ' + rag_result + chr(10) + chr(10)
            + '**Agent 5 (Similarity):** ' + similarity
        )
    except Exception as e:
        response = f'Error: {e}'

    return history + [[question, response]], ''


with gr.Blocks(title='Multi-Agent DB') as demo:
    gr.Markdown('# Multi-Agent: Excel to SQLite to SQL vs. RAG Comparison')

    gr.Markdown('### Step 1: Upload Excel file and build the database (Agents 1 & 2)')
    with gr.Row():
        file_input = gr.File(label='Excel File', file_types=['.xlsx', '.xls'])
        build_btn = gr.Button('Build Database', variant='primary')

    with gr.Row():
        with gr.Column():
            out_schema = gr.Markdown(label='Logical Schema (Agent 1)')
        with gr.Column():
            out_ddl = gr.Code(label='SQL DDL (Agent 2)', language='sql')
    out_status = gr.Textbox(label='Status', interactive=False)

    build_btn.click(
        ui_build_database,
        inputs=[file_input],
        outputs=[out_schema, out_ddl, out_status],
    )

    gr.Markdown('### Step 2: Ask business questions (Agents 3, 4 & 5 run per query)')
    chatbot = gr.Chatbot(label='Q&A History', height=400)
    query_input = gr.Textbox(
        label='Your question',
        placeholder='e.g., How many total sales? Who is the top employee?',
    )
    submit_btn = gr.Button('Submit Query', variant='primary')

    submit_btn.click(
        ui_ask_question,
        inputs=[query_input, chatbot],
        outputs=[chatbot, query_input],
    )
    query_input.submit(
        ui_ask_question,
        inputs=[query_input, chatbot],
        outputs=[chatbot, query_input],
    )

demo.launch(debug=False)

/tmp/ipykernel_1601/2530079974.py:77: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style dictionaries with 'role' and 'content' keys.
  chatbot = gr.Chatbot(label='Q&A History', height=400)
/tmp/ipykernel_1601/2530079974.py:77: DeprecationWarning: The default value of 'allow_tags' in gr.Chatbot will be changed from False to True in Gradio 6.0. You will need to explicitly set allow_tags=False if you want to disable tags in your chatbot.
  chatbot = gr.Chatbot(label='Q&A History', height=400)

It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a196d879b4ffc4d788.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)

Key Takeaways¶

What we demonstrated:

A multi-agent system where each agent has a narrow, well-defined role
Agents pass outputs to each other (logical schema → DDL → populated DB → queries)
The same business question answered in two ways (SQL vs. RAG) with automatic comparison
Separation of setup phase (expensive, run once) from query phase (cheap, repeat many times)

When to use SQL vs. RAG on structured data:

	SQL (Agent 3)	RAG (Agent 4)
Numeric aggregations	Excellent	Often wrong
Fuzzy/open-ended questions	Limited	Better
Joins across tables	Excellent	Struggles
Context size limit	None	Yes
Requires valid schema	Yes	No
Precision	High	Variable

In practice, systems often use both: SQL for precise numeric queries, RAG for explanatory or fuzzy questions — and an LLM-as-a-judge (Agent 5) to flag when they disagree.

Exercises:

Add a new agent that generates a chart from the SQL result
Extend Agent 4 to use vector embeddings (ChromaDB) instead of raw context
Add an Agent 6 that explains WHY the two answers disagreed (when similarity is low)
Replace llm_cascade with a specific provider and see how different LLMs perform at each agent role

Run the code¶

To run this notebook, copy the URL below into your browser’s address bar. The link opens the notebook directly in Google Colab. (If your PDF viewer makes the URL clickable and lands on a broken page, copy the full text manually -- the viewer may have truncated the link at a line break.)

https://colab.research.google.com/github/KarAnalytics/code_demos/blob/main/MultiAgent_DB_multi_human_input.ipynb