In this work, we propose Adaptive Query Reasoning (AdaQR), a hybrid query rewriting framework based on the observation that, for a subset of queries, the semantic transformations induced by LLM reasoning manifest as systematic, structured transformations in the embedding space. Within this framework, a Reasoner Router dynamically directs each query to either fast dense reasoning or deep LLM reasoning. Dense reasoning is performed by the Dense Reasoner, which carries out LLM-style reasoning directly in the embedding space, enabling a controllable trade-off between efficiency and accuracy. Experiments on the large-scale retrieval benchmark BRIGHT show that AdaQR reduces reasoning cost by 28% while preserving, or even improving, retrieval performance by 7% across 17 widely-used LLMs and 5 embedding models.
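The routing idea can be sketched as follows. This is an illustrative toy, not the repo's actual API: the router, its weights, and the confidence score here are hypothetical stand-ins for the trained Reasoner Router.

```python
import numpy as np

def route_query(query_emb: np.ndarray, router_w: np.ndarray, threshold: float = 0.7) -> str:
    """Decide between fast dense reasoning and full LLM reasoning.

    Illustrative sketch: a linear router scores the query embedding with a
    sigmoid; queries whose confidence clears the threshold take the cheap
    embedding-space path, the rest fall back to the (expensive) LLM rewrite.
    """
    score = 1.0 / (1.0 + np.exp(-query_emb @ router_w))  # sigmoid confidence in [0, 1]
    if score >= threshold:
        return "dense"  # apply the Dense Reasoner's embedding-space transform
    return "llm"        # send the query to the LLM for full reasoning/rewriting

# Toy usage with random vectors standing in for a real query embedding.
rng = np.random.default_rng(0)
emb, w = rng.normal(size=64), rng.normal(size=64)
print(route_query(emb, w))
```

Raising the threshold sends more queries down the LLM path (higher cost, potentially higher accuracy); lowering it keeps more queries on the dense path.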
Install dependencies:

```bash
git clone https://github.com/maple826/AdaQR
cd AdaQR
conda env create -f environment.yml
conda activate AdaQR
```

Download the embedding models and benchmark data:

```bash
# Download embedding models, taking BGE-M3 as an example.
huggingface-cli download --resume-download BAAI/bge-m3 --local-dir ./bge-m3

# Download the BRIGHT benchmark datasets.
huggingface-cli download --repo-type dataset --resume-download xlangai/BRIGHT --local-dir ./BRIGHT
```

You can download our queries rewritten by 17 widely-used LLMs on BRIGHT and StackExchange from BRIGHT_reasoned and StackExchange_reasoned, and place them in the AdaQR/BRIGHT directory after unzipping.
Train the Dense Reasoner (this includes both the pre-training and fine-tuning stages):

```bash
python run_DenseReasoner.py --rewrite_llm deepseekr1 --embedding_model bge-m3
```

Train the Dense Reasoner first, and then run AdaQR:

```bash
python run_DenseReasoner.py --rewrite_llm deepseekr1 --embedding_model bge-m3
python run_AdaQR.py --rewrite_llm deepseekr1 --embedding_model bge-m3 --dataset BRIGHT --threshold 0.7
```
| Argument | Description |
| --- | --- |
| rewrite_llm | The name of the LLM used for query rewriting. Currently, we support 17 widely-used LLMs: ["deepseekr1", "deepseekv3", "glm4", "glmz1", "kimi", "llama8b", "llama70b", "mixtral7b", "mixtral8x7b", "qwen4b", "qwen8b", "qwen14b", "qwen32b", "r1_llama70b", "r1_qwen7b", "r1_qwen14b", "r1_qwen32b"] |
| embedding_model | The name of the embedding model used for query embedding. Currently, we support 5 embedding models: ["bge-large-en-v1.5", "bge-m3", "Qwen3-Embedding-0.6B", "Qwen3-Embedding-4B", "ReasonIR-8B"] |
| threshold | The threshold for the Reasoner Router in AdaQR. We set 0.75 for BGE-Large; 0.7 for BGE-M3 and ReasonIR-8B; and 0.6 for Qwen3-Embedding-0.6B and Qwen3-Embedding-4B. |
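If you script experiments over several embedding models, the per-model thresholds above can be kept in a small lookup. This is a convenience sketch, not part of the repo; the fallback default is our own illustrative choice.

```python
# Router thresholds from the argument table above, keyed by the
# value passed to --embedding_model.
ROUTER_THRESHOLDS = {
    "bge-large-en-v1.5": 0.75,
    "bge-m3": 0.70,
    "ReasonIR-8B": 0.70,
    "Qwen3-Embedding-0.6B": 0.60,
    "Qwen3-Embedding-4B": 0.60,
}

def default_threshold(embedding_model: str) -> float:
    # Fall back to 0.7 for unlisted models (illustrative default, not from the repo).
    return ROUTER_THRESHOLDS.get(embedding_model, 0.70)

print(default_threshold("bge-m3"))  # 0.7
```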
We follow the implementation of xlang-ai/BRIGHT for evaluation.
If you find our work helpful, please cite us:


