🔥 Updates

📖Introduction

This is the official repository for the paper "MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL".

In this paper, we propose a multi-agent collaborative Text-to-SQL framework MAC-SQL, which comprises three agents: the Selector, the Decomposer, and the Refiner.

🔥 Updates

[2024.11] Our work has been accepted by COLING 2025 conference paper version. Welcome to cite this paper version.
[2024.04] We have updated the sql-llama-instruct-v0.5.jsonl and training scripts in training_scripts dir of this project. Please check it out.Download the sql-llama-data.zip from Baidu Dsik or Google Drive. Unzip sql-llama-data.zip and get the data dir, which contains sql-llama-instruct-v0.5.jsonl (3375 instances).
[2024.04] We have updated the SQL-Llama-v0.5 model and data.zip (update dev_gold_schema.json in bird and spider) The download links of the updated data are available on Baidu Disk and Google Drive.
[2024.02] We have updated the paper, with updates mainly focusing on experiments and framework details, check it out! link.
[2023.12] We have updated the paper, with updates mainly focusing on the title, abstract, introduction, some details, and appendix. In addition, we give some bad case examples on bad_cases folder, check it out!
[2023.12] We released our first version paper, code. Check it out!

⚡Environment

Config your local environment.

conda create -n macsql python=3.9 -y
conda activate macsql
pip install -r requirements.txt
python -c "import nltk; nltk.download('punkt')"

Note: we use openai==0.28.1, which use openai.ChatCompletion.create to call api.

Edit openai config at core/api_config.py, and set related environment variables of Azure OpenAI API.

Currently, we use gpt-4-1106-preview (128k version) by default, which is 2.5 times less expensive than the gpt-4 (8k) on average.

export OPENAI_API_BASE="YOUR_OPENAI_API_BASE"
export OPENAI_API_KEY="YOUR_OPENAI_API_KEY"

🔧 Data Preparation

In order to prepare the data more quickly, I have packaged the files including the databases of the BIRD dataset and the Spider dataset into data.zip and uploaded them. All files were downloaded on December 19, 2023, ensuring they are the latest version at that moment. The download links are available on Baidu Disk and Google Drive(update on 2024-04-22).

After downloading the data.zip file, you should delete the existing data folder in the project directory and replace it with the unzipped data folder from data.zip.

🚀 Run

The run script will first run 5 examples in Spider to check environment. You should open code comments for different usage.

run.sh for Linux/Mac OS
run.bat for Windows OS

For SQL execution demo, you can use app_bird.py or app_spider.py to get the execution result of your SQL query.

cd ./scripts
python app_bird.py
python app_spider.py

If occur error /bin/bash^M: bad interpreter in Linux, use sed -i -e 's/\r$//' run.sh to solve it.

📝Evaluation Dataset

We evaluate our method on both BIRD dataset and Spider dataset.

EX: Execution Accuracy(%)

VES: Valid Efficiency Score(%)

Refer to our paper for the details.

🫡Run SQL-Llama

Download the SQL-Llama(current v0.5 version) and follow the SQL-Llama-deployment.md to deploy.

Uncomment the MODEL_NAME = 'CodeLlama-7b-hf' in core/api_config.py to set the global model and comment other MODEL_NAME = xxx lines.

Uncomment the export OPENAI_API_BASE='http://0.0.0.0:8000/v1' in run.sh to set the local model api base.

Then, run run.sh to start your local inference.

🌟 Project Structure

├─data # store datasets and databases
|  ├─spider
|  ├─bird
├─core
|  ├─agents.py       # define three agents class
|  ├─api_config.py   # OpenAI API ENV config
|  ├─chat_manager.py # manage the communication between agents
|  ├─const.py        # prompt templates and CONST values
|  ├─llm.py          # api call function and log print
|  ├─utils.py        # utils function
├─scripts            # sqlite execution flask demo
|  ├─app_bird.py
|  ├─app_spider.py
|  ├─templates
├─evaluation # evaluation scripts
|  ├─evaluation_bird_ex.py
|  ├─evaluation_bird_ves.py
|  ├─evaluation_spider.py
├─bad_cases
|  ├─badcase_BIRD(dev)_examples.xlsx
|  └badcase_Spider(dev)_examples.xlsx
├─evaluation_bird_ex_ves.sh # bird evaluation script
├─README.md
├─requirements.txt
├─run.py # main run script
├─run.sh # generation and evaluation script

💬Citation

If you find our work is helpful, please cite as:

@inproceedings{macsql-2025,
  title={MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL},
  author={Wang, Bing and Ren, Changyu and Yang, Jian and Liang, Xinnian and Bai, Jiaqi and Chai, Linzheng and Yan, Zhao and Zhang, Qian-Wen and Yin, Di and Sun, Xing and others},
  booktitle={Proceedings of the 31st International Conference on Computational Linguistics},
  pages={540--557},
  year={2025}
}

👍Contributing

We welcome contributions and suggestions!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

📖Introduction

🔥 Updates

⚡Environment

🔧 Data Preparation

🚀 Run

📝Evaluation Dataset

🫡Run SQL-Llama

🌟 Project Structure

💬Citation

👍Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 35 Commits
assets		assets
bad_cases		bad_cases
core		core
data		data
evaluation		evaluation
scripts		scripts
training_scripts		training_scripts
.gitignore		.gitignore
README.md		README.md
SQL-Llama-deployment.md		SQL-Llama-deployment.md
evaluation_bird_ex_ves.bat		evaluation_bird_ex_ves.bat
evaluation_bird_ex_ves.sh		evaluation_bird_ex_ves.sh
requirements.txt		requirements.txt
run.bat		run.bat
run.py		run.py
run.sh		run.sh

wbbeyourself/MAC-SQL

Folders and files

Latest commit

History

Repository files navigation

📖Introduction

🔥 Updates

⚡Environment

🔧 Data Preparation

🚀 Run

📝Evaluation Dataset

🫡Run SQL-Llama

🌟 Project Structure

💬Citation

👍Contributing

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages