RAG
Albert Gong edited this page Apr 30, 2025 · 6 revisions
PhantomEval supports evaluation of RAG methods via FlashRAG and vLLM.
Running pip install phantom-wiki[eval] should install FlashRAG automatically. Assuming it is installed, follow these steps to generate a corpus and build an index over it:
- To save the corpus in .jsonl format, please run:
python examples/flashrag/save_as_jsonl.py --dataset DATASET --split_list SPLIT_LIST
SPLIT_LIST can be a single split name or a list of split names. If loading a dataset from a local directory (e.g., wiki-v1-easy), additionally pass the --from_local flag.
This script generates a .jsonl file at corpus/<split>.jsonl for each split in SPLIT_LIST.
- To construct a BM25 index, please run:
python -m flashrag.retriever.index_builder --retrieval_method bm25 --corpus_path corpus/<split>.jsonl --bm25_backend bm25s --save_dir SAVE_DIR
Tip: As a convention, set SAVE_DIR to indexes/<split>.
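When several splits are involved, it can help to generate the two commands above programmatically. The following is a minimal sketch, not part of PhantomEval or FlashRAG: the helper names corpus_command and index_command are hypothetical, the split names are placeholders, and it assumes that multiple splits are passed to --split_list separated by spaces. It only prints the commands; pass each list to subprocess.run(cmd, check=True) to execute it.

```python
import shlex


def corpus_command(dataset, splits, from_local=False):
    """Build the save_as_jsonl.py command for one or more splits (hypothetical helper)."""
    cmd = ["python", "examples/flashrag/save_as_jsonl.py",
           "--dataset", dataset,
           # Assumption: multiple splits are given as space-separated values.
           "--split_list", *splits]
    if from_local:
        # Only needed when the dataset is loaded from a local directory.
        cmd.append("--from_local")
    return cmd


def index_command(split, save_dir=None):
    """Build the BM25 index_builder command for one split (hypothetical helper)."""
    # Follow the convention of saving the index under indexes/<split>.
    save_dir = save_dir or f"indexes/{split}"
    return ["python", "-m", "flashrag.retriever.index_builder",
            "--retrieval_method", "bm25",
            "--corpus_path", f"corpus/{split}.jsonl",
            "--bm25_backend", "bm25s",
            "--save_dir", save_dir]


# Placeholder split names; replace with the splits of your dataset.
splits = ["split_a", "split_b"]

# Step 1: one command writes corpus/<split>.jsonl for every split.
print(shlex.join(corpus_command("wiki-v1-easy", splits, from_local=True)))

# Step 2: one index per split, each reading the corpus file from step 1.
for split in splits:
    print(shlex.join(index_command(split)))
```

Building each command as a list rather than a single string avoids shell quoting issues if a split name or directory contains spaces.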