Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

Multi-Agent System: SQL vs. RAG for Business Database Queries

University of Kansas School of Business

This notebook demonstrates a multi-agent system where four specialized LLM agents work together to turn an Excel spreadsheet into a queryable database and answer natural-language business questions in two different ways — then compare the results.

The four agents:

AgentRoleOutput
Agent 1: Data ArchitectReads Excel sheets and proposes a logical database schema (tables, primary/foreign keys)Bulleted schema description
Agent 2: SQL DeveloperConverts the logical schema into SQL DDL, creates the SQLite DB, and populates it from the Excel dataSQL CREATE TABLE statements + populated DB
Agent 3: SQL Query WriterGiven a business question, writes a SQL SELECT query and executes it against the databaseStructured SQL result rows
Agent 4: RAG AnalystGiven the same business question, answers it by reading the raw data directly (no SQL)Natural-language answer
Agent 5: EvaluatorCompares the SQL result and RAG answer, returns a similarity scorePercentage score (0–100%)

Why two approaches? SQL is precise but requires structured data and correct queries. RAG is flexible but may miss details or hallucinate. Comparing them on the same question shows the tradeoffs clearly.

Workflow:

  • Phase 1 (one-time setup): Upload an Excel file → Agents 1 and 2 build the database

  • Phase 2 (repeat as needed): Submit natural-language queries → Agents 3, 4, and 5 run on each query

You can run many queries in Phase 2 without redoing the Phase 1 setup.

# 1. INSTALL DEPENDENCIES
!pip install -q gradio openai pandas chromadb openpyxl langchain-community langchain-openai git+https://github.com/KarAnalytics/llm_cascade.git sentence-transformers

The cell above installs the full toolkit for the multi-agent system. Beyond the basics (pandas, sqlite3), we add chromadb for vector storage (used by the RAG agent), langchain-community and langchain-openai for embedding utilities, sentence-transformers for local embedding models, and llm_cascade for provider-agnostic LLM calls. The multi-agent version requires more libraries than the single-agent notebook because it runs two parallel answer pathways -- SQL and RAG -- plus an evaluator to compare them.

Multi-Agent System: SQL vs. RAG for Business Database Queries

Why would you use five agents when one could do the job? The answer lies in a principle borrowed from software engineering: separation of concerns. Each agent in this notebook has a narrow, well-defined role and a system prompt tailored to that role. The Data Architect thinks about table relationships and primary keys. The SQL Developer thinks about data types and CREATE TABLE syntax. The SQL Query Writer thinks about JOINs and WHERE clauses. The RAG Analyst ignores SQL entirely and reads the raw data. And the Evaluator acts as an impartial judge, comparing the two answers without knowing which approach produced which.

This specialization offers two concrete benefits. First, each agent’s system prompt can be short, focused, and easy to debug -- if the SQL is wrong, you know exactly which agent and which prompt to inspect. Second, by running two independent answer pathways (SQL and RAG) and comparing them, the system catches mistakes that either approach alone would miss. When the SQL answer says “total revenue is USD 1.2M” and the RAG answer says “USD 1.2M,” you have high confidence the number is correct. When they disagree, that disagreement itself is valuable information.

The five agents:

AgentRoleWhat it specializes in
Agent 1: Data ArchitectReads Excel sheets, proposes a logical schemaUnderstanding data structure, primary/foreign keys
Agent 2: SQL DeveloperConverts the logical schema into DDL and creates the databaseSQL syntax, data types, table creation
Agent 3: SQL Query WriterWrites and executes SQL SELECT queries for business questionsQuery optimization, JOINs, aggregations
Agent 4: RAG AnalystAnswers the same question by reading raw data (no SQL)Natural language reasoning over tabular data
Agent 5: EvaluatorCompares SQL and RAG answers, returns a similarity scoreLLM-as-a-judge pattern, cross-validation

Workflow:

  • Phase 1 (one-time setup): Upload an Excel file -> Agents 1 and 2 build the database

  • Phase 2 (repeat as needed): Submit natural-language queries -> Agents 3, 4, and 5 run on each query

Imports and Setup

We use pandas to read Excel, sqlite3 for the database, chromadb (for future vector RAG), and llm_cascade for automatic LLM provider fallback across 8 free-tier providers.

import os
import sqlite3
import pandas as pd
import chromadb
from llm_cascade import get_cascade

llm = get_cascade()

# Global database config
DB_NAME = 'agent_database.db'
CHROMA_CLIENT = chromadb.Client()
LLM Cascade - available providers:
  + Gemini           model=gemini-2.5-flash
  + Ollama           model=kimi-k2.5:cloud
  + Groq             model=llama-3.3-70b-versatile
  + HuggingFace      model=meta-llama/Llama-3.3-70B-Instruct
  + Cohere           model=command-a-03-2025
  + OpenRouter       model=meta-llama/llama-3.3-70b-instruct:free
  + OpenAI           model=gpt-4o-mini
Not configured (skipped):
  - Grok (xAI)       (set XAI_API_KEY)

Agent 1: Data Architect

Role: Read each Excel sheet, look at the column names and data types, and propose a logical database schema in plain English.

How it works:

  • Reads each sheet with pandas

  • Extracts column names and dtypes

  • Sends this metadata to the LLM with a prompt asking for a bulleted schema description (primary keys, foreign keys, relationships)

  • The LLM returns a human-readable schema — no SQL yet

The logical schema serves as the shared plan that Agent 2 (SQL Developer) will implement.

def agent_1_ingest(file_path):
    '''Agent 1 -- Data Architect: reads Excel, proposes logical schema.'''
    xl = pd.ExcelFile(file_path)
    sheet_data = ''
    data_map = {}

    for sheet in xl.sheet_names:
        df = pd.read_excel(file_path, sheet_name=sheet)
        data_map[sheet] = df
        cols = ', '.join([f'{col} ({dtype})' for col, dtype in zip(df.columns, df.dtypes)])
        sheet_data += f'Sheet Name: {sheet}' + chr(10) + f'Columns detected: {cols}' + chr(10) + chr(10)

    prompt = (
        'Analyze these sheets and provide a logical database schema. '
        'Use ONLY bullet points. Identify suggested Primary Keys and Foreign Keys:' + chr(10) + sheet_data
    )
    system_prompt = 'You are a Data Architect. Output a logical schema in bulleted text. Do NOT write SQL code.'
    response = llm.generate(prompt, system_prompt=system_prompt)
    return response.text, data_map


print('Agent 1 (Data Architect) ready.')
Agent 1 (Data Architect) ready.

Agent 2: SQL Developer

Role: Take the logical schema from Agent 1 and turn it into an actual SQLite database populated with the Excel data.

How it works:

  • Prompts the LLM to generate SQL CREATE TABLE DDL statements from the logical schema

  • Executes the DDL against a fresh SQLite database

  • Uses pandas to_sql() to populate each table from the corresponding Excel sheet

  • Returns the DDL text and a status message

After this agent runs, we have a queryable SQLite database ready for business questions.

def agent_2_sql_developer(logical_schema, data_map):
    '''Agent 2 -- SQL Developer: generates DDL, creates DB, populates tables.'''
    prompt = f'Based on this logical schema, write SQLite CREATE TABLE statements:' + chr(10) + logical_schema
    system_prompt = 'Return ONLY raw SQL CREATE TABLE statements. No explanation.'
    response = llm.generate(prompt, system_prompt=system_prompt)
    ddl_resp = response.text.replace('```sql', '').replace('```', '')

    # Execute DDL and insert data
    if os.path.exists(DB_NAME):
        os.remove(DB_NAME)
    conn = sqlite3.connect(DB_NAME)
    cursor = conn.cursor()
    try:
        cursor.executescript(ddl_resp)
        for table, df in data_map.items():
            clean_table_name = table.replace(' ', '_')
            df.to_sql(clean_table_name, conn, if_exists='replace', index=False)
        conn.commit()
        status = 'Database Ready: Tables created and data inserted.'
    except Exception as e:
        status = f'Database Error: {str(e)}'
    finally:
        conn.close()

    return ddl_resp, status


print('Agent 2 (SQL Developer) ready.')
Agent 2 (SQL Developer) ready.

Agent 3: SQL Query Writer

Role: Given a natural-language business question, write a SQL SELECT query and execute it against the database.

How it works:

  • Sends the schema + the user’s question to the LLM

  • Asks the LLM to return ONLY the SQL query (no explanation)

  • Strips any markdown code fences

  • Runs the query with pandas and returns the result as a formatted string

Strengths: Precise, verifiable, handles aggregations and joins well. Weaknesses: Fails if the LLM writes bad SQL or misunderstands the schema.

def agent_3_sql_query(question, logical_schema, conn):
    '''Agent 3 -- SQL Query Writer: writes and executes a SQL SELECT for the question.'''
    sql_gen_prompt = (
        'Schema:' + chr(10) + logical_schema + chr(10)
        + 'Query: ' + question + chr(10)
        + 'Generate ONLY the SQL SELECT statement. Table names might have underscores.'
    )
    response = llm.generate(sql_gen_prompt)
    sql_query = response.text.replace('```sql', '').replace('```', '').strip()

    try:
        result_df = pd.read_sql_query(sql_query, conn)
        return sql_query, result_df.to_string()
    except Exception as e:
        return sql_query, f'SQL Execution Error: {str(e)}'


print('Agent 3 (SQL Query Writer) ready.')
Agent 3 (SQL Query Writer) ready.

Agent 4: RAG Analyst

Role: Answer the same business question using retrieval-augmented generation — no SQL, just raw data passed into the LLM prompt as context.

How it works:

  • Builds a context string from the first 20 rows of each table (as plain text)

  • Sends that context + the user’s question to the LLM

  • The LLM reads the raw data and answers in natural language

Strengths: Works without valid SQL, handles fuzzy or open-ended questions. Weaknesses: Limited by context window (only sees first 20 rows here), may hallucinate, less precise for numeric aggregations.

def agent_4_rag_analyst(question, data_map):
    '''Agent 4 -- RAG Analyst: answers from raw data passed as context.'''
    context = chr(10).join(
        ['Table ' + k + ':' + chr(10) + v.head(20).to_string() for k, v in data_map.items()]
    )
    rag_prompt = (
        'Data Context:' + chr(10) + context + chr(10)
        + 'User Question: ' + question + chr(10)
        + 'Answer the question based on the data provided.'
    )
    response = llm.generate(rag_prompt)
    return response.text


print('Agent 4 (RAG Analyst) ready.')
Agent 4 (RAG Analyst) ready.

Agent 5: Evaluator

Role: Compare the SQL answer (Agent 3) and the RAG answer (Agent 4) for the same question, and return a similarity score.

How it works:

  • Sends both answers to the LLM

  • Asks it to return ONLY a percentage score (0–100%)

  • High score = both approaches agree; low score = they diverge (worth investigating why)

This is a simple “LLM-as-a-judge” pattern — using an LLM to evaluate other LLM outputs.

def agent_5_evaluator(sql_result, rag_result):
    '''Agent 5 -- Evaluator: compares SQL and RAG answers, returns a similarity score.'''
    eval_prompt = (
        'Compare these two answers and provide ONLY a Similarity Score (0-100%).' + chr(10)
        + 'Answer A: ' + sql_result + chr(10)
        + 'Answer B: ' + rag_result
    )
    system_prompt = 'You are a calculator. Output only a percentage score.'
    response = llm.generate(eval_prompt, system_prompt=system_prompt)
    return response.text


print('Agent 5 (Evaluator) ready.')
Agent 5 (Evaluator) ready.

Interactive App: Upload Excel and Ask Questions

The cell below launches a Gradio app that gives you an interactive UI for the multi-agent system:

  • Step 1: Upload your Excel file and click Build Database. Agents 1 and 2 run: the Data Architect designs a logical schema, and the SQL Developer writes the DDL and creates a SQLite database.

  • Step 2: Type a business question and click Submit Query (or press Enter). Agents 3, 4, and 5 run: the SQL Query Writer generates SQL, the RAG Analyst answers from raw data, and the Evaluator compares the two. Each Q&A is added to the chat history.

  • Repeat Step 2 as many times as you want — no need to re-upload the file.

If you do not have an Excel file on hand, download this sample dataset and upload it: SUPPLIER_PARTS.xlsx (click Download raw file on the GitHub page).

This is the classroom-friendly way to compare SQL vs. RAG live. Students can submit their own questions and watch the agents disagree (or agree) in real time.

import gradio as gr

# Module-level state so the Gradio UI can persist the schema across queries
_app_state = {'logical_schema': None, 'data_map': None, 'db_ready': False}


def ui_build_database(file):
    '''Phase 1: run Agents 1 and 2 on the uploaded file.'''
    if file is None:
        return '', '', 'Please upload an Excel file first.'

    try:
        logical_schema, data_map = agent_1_ingest(file.name)
        ddl, status = agent_2_sql_developer(logical_schema, data_map)

        _app_state['logical_schema'] = logical_schema
        _app_state['data_map'] = data_map
        _app_state['db_ready'] = 'Ready' in status
        return logical_schema, ddl, status
    except Exception as e:
        return '', '', f'Error building database: {e}'


def ui_ask_question(question, history):
    '''Phase 2: run Agents 3, 4, 5 on a single question.'''
    if not _app_state['db_ready']:
        bot_msg = 'Please upload an Excel file and click Build Database first.'
        return history + [[question, bot_msg]], ''

    if not question or not question.strip():
        return history, question

    try:
        conn = sqlite3.connect(DB_NAME)
        try:
            sql_query, sql_result = agent_3_sql_query(question, _app_state['logical_schema'], conn)
        finally:
            conn.close()

        rag_result = agent_4_rag_analyst(question, _app_state['data_map'])
        similarity = agent_5_evaluator(sql_result, rag_result)

        response = (
            '**Agent 3 (SQL):**' + chr(10) + '```sql' + chr(10) + sql_query + chr(10) + '```' + chr(10) + chr(10)
            + '**SQL Result:**' + chr(10) + '```' + chr(10) + sql_result + chr(10) + '```' + chr(10) + chr(10)
            + '**Agent 4 (RAG):** ' + rag_result + chr(10) + chr(10)
            + '**Agent 5 (Similarity):** ' + similarity
        )
    except Exception as e:
        response = f'Error: {e}'

    return history + [[question, response]], ''


with gr.Blocks(title='Multi-Agent DB') as demo:
    gr.Markdown('# Multi-Agent: Excel to SQLite to SQL vs. RAG Comparison')

    gr.Markdown('### Step 1: Upload Excel file and build the database (Agents 1 & 2)')
    with gr.Row():
        file_input = gr.File(label='Excel File', file_types=['.xlsx', '.xls'])
        build_btn = gr.Button('Build Database', variant='primary')

    with gr.Row():
        with gr.Column():
            out_schema = gr.Markdown(label='Logical Schema (Agent 1)')
        with gr.Column():
            out_ddl = gr.Code(label='SQL DDL (Agent 2)', language='sql')
    out_status = gr.Textbox(label='Status', interactive=False)

    build_btn.click(
        ui_build_database,
        inputs=[file_input],
        outputs=[out_schema, out_ddl, out_status],
    )

    gr.Markdown('### Step 2: Ask business questions (Agents 3, 4 & 5 run per query)')
    chatbot = gr.Chatbot(label='Q&A History', height=400)
    query_input = gr.Textbox(
        label='Your question',
        placeholder='e.g., How many total sales? Who is the top employee?',
    )
    submit_btn = gr.Button('Submit Query', variant='primary')

    submit_btn.click(
        ui_ask_question,
        inputs=[query_input, chatbot],
        outputs=[chatbot, query_input],
    )
    query_input.submit(
        ui_ask_question,
        inputs=[query_input, chatbot],
        outputs=[chatbot, query_input],
    )

demo.launch(debug=False)
/tmp/ipykernel_1601/2530079974.py:77: UserWarning: You have not specified a value for the `type` parameter. Defaulting to the 'tuples' format for chatbot messages, but this is deprecated and will be removed in a future version of Gradio. Please set type='messages' instead, which uses openai-style dictionaries with 'role' and 'content' keys.
  chatbot = gr.Chatbot(label='Q&A History', height=400)
/tmp/ipykernel_1601/2530079974.py:77: DeprecationWarning: The default value of 'allow_tags' in gr.Chatbot will be changed from False to True in Gradio 6.0. You will need to explicitly set allow_tags=False if you want to disable tags in your chatbot.
  chatbot = gr.Chatbot(label='Q&A History', height=400)
It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://a196d879b4ffc4d788.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
Loading...

Key Takeaways

What we demonstrated:

  • A multi-agent system where each agent has a narrow, well-defined role

  • Agents pass outputs to each other (logical schema → DDL → populated DB → queries)

  • The same business question answered in two ways (SQL vs. RAG) with automatic comparison

  • Separation of setup phase (expensive, run once) from query phase (cheap, repeat many times)

When to use SQL vs. RAG on structured data:

SQL (Agent 3)RAG (Agent 4)
Numeric aggregationsExcellentOften wrong
Fuzzy/open-ended questionsLimitedBetter
Joins across tablesExcellentStruggles
Context size limitNoneYes
Requires valid schemaYesNo
PrecisionHighVariable

In practice, systems often use both: SQL for precise numeric queries, RAG for explanatory or fuzzy questions — and an LLM-as-a-judge (Agent 5) to flag when they disagree.

Exercises:

  • Add a new agent that generates a chart from the SQL result

  • Extend Agent 4 to use vector embeddings (ChromaDB) instead of raw context

  • Add an Agent 6 that explains WHY the two answers disagreed (when similarity is low)

  • Replace llm_cascade with a specific provider and see how different LLMs perform at each agent role


Run the code

To run this notebook, copy the URL below into your browser’s address bar. The link opens the notebook directly in Google Colab. (If your PDF viewer makes the URL clickable and lands on a broken page, copy the full text manually -- the viewer may have truncated the link at a line break.)

https://colab.research.google.com/github/KarAnalytics/code_demos/blob/main/MultiAgent_DB_multi_human_input.ipynb