Source code for paper "Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models".
If you find this project useful, feel free to ⭐️ it and give it a citation!
Auto-RAG is an autonomous iterative retrieval model centered on the LLM's powerful decision-making capabilities. Auto-RAG models the interaction between the LLM and the retriever through multi-turn dialogue, employs iterative reasoning to determine when and what to retrieve, ceasing the iteration when sufficient external knowledge is available, and subsequently providing the answer to the user.
- GUI interaction: We provide a deployable user interaction interface. After inputting a question, Auto-RAG autonomously engages in interaction with the retriever without any human intervention. Users have the option to decide whether to display the details of the interaction between Auto-RAG and the retriever.
- To interact with Auto-RAG in your browser, follow the guide for GUI interaction.
We provide trained Auto-RAG models using the synthetic data. Please refer to https://huggingface.co/ICTNLP/Auto-RAG-Llama-3-8B-Instruct.
For retriever, we utilize e5-base-v2, following FlashRAG.
To deploy Auto-RAG, retrieval corpus is required. You can download from official website or our processed corpus (which will be uploaded soon), and following FlashRAG to build your own index.
- Clone Auto-RAG's repo.
git clone https://github.com/ictnlp/Auto-RAG.git
cd Auto-RAG
export ROOT=pwd
- Environment requirements: Python 3.12, Gradio 5.1.0
conda env create -f environment.yml
- Download indexes and corpus
We used the following dump version: https://archive.org/download/enwiki-20181220/enwiki-20181220-pages-articles.xml.bz2. Please follow the FlashRAG to process and index it.
We use vLLM to deploy the model for inference. You can update the parameters in vllm.sh to adjust the GPU and model path configuration, then execute:
bash vllm.sh
To interact with Auto-RAG in your browser, you should firstly download the trained Auto-RAG Models and prepare for retrieval corpus.
cd $ROOT/webui
CUDA_VISIBLE_DEVICES=0,1,2,3 python webui.py\
--main_model {model_name}\
--main_model_url {main_model_url}\
--dense_corpus_path {dense_corpus_path}\
--dense_index_path {dense_index_path}
Tip
The interaction process between Auto-RAG and the retriever can be optionally displayed by adjusting a toggle.
Note
Experimental results show that Auto-RAG outperforms all baselines across six benchmarks.
If this repository is useful for you, please cite as:
@article{yu2024autorag,
title={Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models},
author={Tian Yu and Shaolei Zhang and Yang Feng},
year={2024},
eprint={2411.19443},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2411.19443},
}
If you have any questions, feel free to contact [email protected]
.