More2Search/Awesome-Search-LLM
Awesome Tree Search Algorithms for LLM


arXiv Citation License: MIT

Welcome to the "Awesome Search" repository!

This repository accompanies our paper, providing a comprehensive review and unified framework for tree search-based methods and demonstrating how these algorithms enable scalable, efficient problem-solving in LLM test-time reasoning.

Dive into this repository to explore how innovative search-based methods like MCTS are reshaping reasoning capabilities in LLMs!

🔔 🔔 🔔 For more detailed information, please refer to our paper.

✉️ ➡️ 📪 If you have any questions, please feel free to contact us at:

{weijiaqi, yangyuejin}@pjlab.org.cn | [email protected]


(Figure: Test-time scaling)

👋 Introduction

As the scaling of large language models (LLMs) during training reaches diminishing returns, there has been a shift toward scalable test-time reasoning algorithms. Chain-of-Thought (CoT) reasoning has emerged as a promising approach, enabling intermediate reasoning steps in text space. However, traditional CoT methods suffer from single-path exploration, which limits their ability to fully explore complex reasoning spaces.

To address this limitation, recent works have adopted tree search-based reasoning frameworks, inspired by classical search algorithms such as Depth-First Search (DFS), Breadth-First Search (BFS), and Monte Carlo Tree Search (MCTS). These methods demonstrate significant potential in balancing exploration and exploitation, enabling LLMs to efficiently solve complex tasks at test time.
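To make that exploration–exploitation balance concrete, below is a minimal, self-contained MCTS sketch in Python over a toy "reasoning" task (composing steps to hit a target sum exactly). The task, the `rollout` reward, and all names are stand-ins invented for this illustration; in the LLM setting, `expand()` would sample candidate reasoning steps from the model and the reward would come from a verifier or reward model.

```python
import math
import random

# Toy stand-in for a reasoning task: compose steps (+1/+2/+3) to hit TARGET
# exactly. In an LLM setting, expand() would sample candidate reasoning steps
# from the model and rollout() would query a verifier or reward model.
TARGET = 10
STEPS = (1, 2, 3)
C_EXPLORE = 1.4  # UCT exploration constant

class Node:
    def __init__(self, total=0, parent=None):
        self.total = total        # partial solution state
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0          # accumulated reward from simulations

    def expand(self):
        for step in STEPS:
            self.children.append(Node(self.total + step, parent=self))

    def uct_child(self):
        # Trade off mean reward (exploitation) against rarely visited
        # children (exploration); unvisited children are tried first.
        def uct(c):
            if c.visits == 0:
                return float("inf")
            return (c.value / c.visits
                    + C_EXPLORE * math.sqrt(math.log(self.visits) / c.visits))
        return max(self.children, key=uct)

def rollout(node):
    # Random playout to termination; reward 1.0 only for an exact hit.
    total = node.total
    while total < TARGET:
        total += random.choice(STEPS)
    return 1.0 if total == TARGET else 0.0

def mcts(iterations=500):
    root = Node()
    for _ in range(iterations):
        node = root
        while node.children:                 # 1. selection
            node = node.uct_child()
        if node.total < TARGET:              # 2. expansion
            node.expand()
            node = random.choice(node.children)
        reward = rollout(node)               # 3. simulation
        while node is not None:              # 4. backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    # Most-visited child = preferred first reasoning step.
    return max(root.children, key=lambda c: c.visits)

random.seed(0)
best = mcts()
print("preferred first step:", best.total, "visits:", best.visits)
```

Swapping `rollout` for a learned value function yields AlphaZero-style search, and replacing the exact-hit reward with a step-level (process) reward model yields variants of the methods collected below.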

(Figure: Tree search evolution)

This repository provides a comprehensive framework for tree search-based reasoning in LLMs, aiming to unify and advance the field. Our primary contributions include:

  1. A Unified Formalism:
    We propose a structured mathematical framework to analyze and compare tree search algorithms, focusing on their core mechanisms, reasoning reward formulations, and application domains. Specifically, we formalize the role of "reward" as a transient guidance signal in test-time search.

  2. A Systematic Taxonomy:
    We categorize existing search algorithms along three primary axes:

    • The search mechanism (e.g., DFS, BFS, MCTS)
    • The reward formulation
    • The application domain

    This taxonomy provides clarity for researchers and practitioners navigating this evolving field.

  3. A Synthesis of Applications and Future Directions:
    We map the primary applications of tree search reasoning, including mathematical reasoning, data generation, and optimization. Additionally, we highlight key areas for future research, such as improving general-purpose reasoning capabilities and enhancing scalability.
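As a concrete instance of the "search mechanism" axis above, most MCTS-style methods in this list select children by some variant of the classical UCT rule (standard notation, not necessarily the exact formalism used in our paper):

```latex
a^{*} = \arg\max_{a}\left[\; Q(s,a) \;+\; c\,\sqrt{\frac{\ln N(s)}{N(s,a)}} \;\right]
```

where $Q(s,a)$ is the mean reward observed after taking reasoning step $a$ in state $s$, $N(\cdot)$ are visit counts, and $c$ controls the exploration–exploitation trade-off.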

(Figure: Tree search formalism)

Our survey highlights the transformative potential of tree search-based reasoning frameworks in overcoming the limitations of traditional CoT methods. By providing a unified formalism, systematic taxonomy, and practical insights, we aim to establish a robust foundation for advancing LLM test-time reasoning.

For more details, please refer to our full paper or explore the examples and implementations provided in this repository.

📑 Contents


Part 1: MCTS for Direct Inference-Time Enhancement

General Reasoning & Problem Solving

  • Tree of Thoughts: Deliberate Problem Solving with Large Language Models, Yao et al. NeurIPS 2023
  • Reasoning with Language Model is Planning with World Model, Hao et al. arXiv 2023
  • AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training, Feng et al. arXiv 2023
  • Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning, Wang et al. arXiv 2024
  • PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding, Chaffin et al. arXiv 2021
  • Don't Throw Away Your Value Model! Generating More Preferable Text with Value-Guided Monte-Carlo Tree Search Decoding, Liu et al. arXiv 2023
  • ARGS: Alignment as Reward-Guided Search, Khanov et al. arXiv 2024
  • When Is Tree Search Useful for LLM Planning? It Depends on the Discriminator, Chen et al. arXiv 2024
  • Interpretable Contrastive Monte Carlo Tree Search Reasoning, Gao et al. arXiv 2024
  • Synthetic Data Generation from Real Data Sources Using Monte Carlo Tree Search and Large Language Models, Locowic et al. Authorea 2024

Mathematical Reasoning

  • ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search, Zhang et al. arXiv 2024
  • Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing, Tian et al. NeurIPS 2024
  • Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning, Xie et al. arXiv 2024
  • Accessing GPT-4 Level Mathematical Olympiad Solutions via Monte Carlo Tree Self-Refine with LLaMa-3 8B, Zhang et al. arXiv 2024
  • No Train Still Gain. Unleash Mathematical Reasoning of Large Language Models with Monte Carlo Tree Search Guided by Energy Function, Xu et al. arXiv 2023
  • Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers, Qi et al. arXiv 2024
  • AlphaMath Almost Zero: Process Supervision Without Process, Chen et al. NeurIPS 2024
  • LiteSearch: Efficacious Tree Search for LLM, Wang et al. arXiv 2024
  • Markov Chain of Thought for Efficient Mathematical Reasoning, Yang et al. arXiv 2024
  • OVM, Outcome-Supervised Value Models for Planning in Mathematical Reasoning, Yu et al. arXiv 2023
  • MindStar: Enhancing Math Reasoning in Pre-Trained LLMs at Inference Time, Kang et al. arXiv 2024
  • LLaMA-Berry: Pairwise Optimization for Olympiad-Level Mathematical Reasoning via O1-like Monte Carlo Tree Search, Zhang et al. NAACL 2025
  • Beyond Examples: High-Level Automated Reasoning Paradigm in In-Context Learning via MCTS, Wu et al. arXiv 2024
  • BoostStep: Boosting Mathematical Capability of Large Language Models via Improved Single-Step Reasoning, Zhang et al. arXiv 2025
  • Step-Level Value Preference Optimization for Mathematical Reasoning, Chen et al. arXiv 2024
  • Improve Mathematical Reasoning in Language Models by Automated Process Supervision, Luo et al. arXiv 2024
  • What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning, Ma et al. AAAI 2025
  • Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning, Park et al. arXiv 2024
  • rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking, Guan et al. arXiv 2025
  • Leveraging Constrained Monte Carlo Tree Search to Generate Reliable Long Chain-of-Thought for Mathematical Reasoning, Lin et al. arXiv 2025

Code Generation & Software Engineering

  • Planning With Large Language Models for Code Generation, Zhang et al. arXiv 2023
  • Make Every Move Count: LLM-Based High-Quality RTL Code Generation Using MCTS, DeLorenzo et al. arXiv 2024
  • RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation, Li et al. arXiv 2024
  • VerMCTS: Synthesizing Multi-Step Programs Using a Verifier, a Large Language Model, and Tree Search, Brandfonbrener et al. arXiv 2024
  • Generating Code World Models With Large Language Models Guided by Monte Carlo Tree Search, Dainese et al. NeurIPS 2024
  • Planning In Natural Language Improves LLM Search For Code Generation, Wang et al. arXiv 2024
  • O1-Coder: An O1 Replication for Coding, Zhang et al. arXiv 2024
  • SRA-MCTS: Self-Driven Reasoning Augmentation With Monte Carlo Tree Search for Code Generation, Xu et al. arXiv 2024
  • SWE-Search: Enhancing Software Agents With Monte Carlo Tree Search and Iterative Refinement, Antoniades et al. arXiv 2024
  • PepTune: De Novo Generation of Therapeutic Peptides With Multi-Objective-Guided Discrete Diffusion, Tang et al. PMC 2025
  • BFS-Prover: Scalable Best-First Tree Search for LLM-Based Automatic Theorem Proving, Xin et al. arXiv 2025
  • MCTS-Judge: Test-Time Scaling in LLM-as-a-Judge for Code Correctness Evaluation, Wang et al. arXiv 2025
  • SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution, Li et al. arXiv 2025
  • Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving, Zimmer et al. arXiv 2025
  • APRMCTS: Improving LLM-Based Automated Program Repair With Iterative Tree Search, Hu et al. arXiv 2025
  • MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution, Wang et al. arXiv 2025

LLM Agents & Interactive Environments

  • Tree Search for Language Model Agents, Koh et al. arXiv 2024
  • Enhancing Decision-Making for LLM Agents via Step-Level Q-Value Models, Zhai et al. AAAI 2025
  • Large Language Models as Commonsense Knowledge for Large-Scale Task Planning, Zhao et al. NeurIPS 2023
  • Can Large Language Models Play Games? A Case Study of A Self-Play Approach, Guo et al. arXiv 2024
  • Planning with Large Language Models for Conversational Agents, Li et al. arXiv 2024
  • Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning, Yu et al. arXiv 2024
  • REX: Rapid Exploration and eXploitation for AI Agents, Murthy et al. OpenReview 2023
  • SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning, Chi et al. arXiv 2024
  • ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search, Zhuang et al. arXiv 2023
  • Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents, Putta et al. arXiv 2024
  • Information Directed Tree Search: Reasoning and Planning with Language Agents, Chandak et al. NeurIPS 2024
  • LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, Meng et al. arXiv 2024
  • Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search, Light et al. arXiv 2024
  • Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models, Zhou et al. arXiv 2023
  • Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration, Ye et al. arXiv 2024
  • SAPIENT: Mastering Multi-Turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search, Du et al. arXiv 2024
  • WebPilot: A Versatile and Autonomous Multi-Agent System for Web Task Execution with Strategic Exploration, Zhang et al. AAAI 2025
  • A Training Data Recipe to Accelerate A* Search with Language Models, Gupta et al. arXiv 2024
  • Planning Like Human: A Dual-Process Framework for Dialogue Planning, He et al. arXiv 2024
  • Prompt-Based Monte-Carlo Tree Search for Goal-Oriented Dialogue Policy Planning, Yu et al. arXiv 2023
  • Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design, Zheng et al. arXiv 2025
  • Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training, Yuan et al. arXiv 2025
  • MASTER: A Multi-Agent System with LLM Specialized MCTS, Gan et al. arXiv 2025
  • Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search, Shi et al. arXiv 2025
  • SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents, Lin et al. arXiv 2025
  • WebSynthesis: World-Model-Guided MCTS for Efficient WebUI-Trajectory Synthesis, Gao et al. arXiv 2025
  • Mirage-1: Augmenting and Updating GUI Agent with Hierarchical Multimodal Skills, Xie et al. arXiv 2025
  • AgentSwift: Efficient LLM Agent Design via Value-Guided Hierarchical Search, Li et al. arXiv 2025
  • HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems, Hou et al. arXiv 2025
  • AgentXploit: End-to-End Redteaming of Black-Box AI Agents, Wang et al. arXiv 2025

Retrieval-Augmented Generation (RAG) & Knowledge-Intensive Tasks

  • A Novel Approach to Optimize Large Language Models for Named Entity Matching with Monte Carlo Tree Search, Volkova et al. Authorea 2024
  • KNOT-MCTS: An Effective Approach to Addressing Hallucinations in Generative Language Modeling for Question Answering, Wu et al. ROCLING 2023
  • Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-Intensive Tasks, Xu et al. ACM Web Conf 2024
  • Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design, Sprueill et al. arXiv 2023
  • RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement, Jiang et al. arXiv 2024
  • RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models, Tran et al. arXiv 2024
  • CORAG: A Cost-Constrained Retrieval Optimization System for Retrieval-Augmented Generation, Wang et al. arXiv 2024
  • Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling, Li et al. ACL 2025
  • RiTeK: A Dataset for Large Language Models Complex Reasoning Over Textual Knowledge Graphs, Huang et al. arXiv 2024
  • KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection, Choi et al. arXiv 2023
  • AirRAG: Activating Intrinsic Reasoning for Retrieval Augmented Generation Using Tree-Based Search, Feng et al. arXiv 2025
  • MedS3: Towards Medical Small Language Models with Self-Evolved Slow Thinking, Jiang et al. arXiv 2025
  • KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search, Luo et al. arXiv 2025
  • MCTS-KBQA: Monte Carlo Tree Search for Knowledge Base Question Answering, Xiong et al. arXiv 2025
  • Enhancing Test-Time Scaling of Large Language Models with Hierarchical Retrieval-Augmented MCTS, Dou et al. arXiv 2025
  • MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search, Hu et al. arXiv 2025
  • Toward Structured Knowledge Reasoning: Contrastive Retrieval-Augmented Generation on Experience, Gu et al. arXiv 2025
  • FREESON: Retriever-Free Retrieval-Augmented Reasoning via Corpus-Traversing MCTS, Kim et al. arXiv 2025
  • Structural Entropy Guided Agent for Detecting and Repairing Knowledge Deficiencies in LLMs, Wei et al. arXiv 2025

Multimodal Reasoning

  • Mulberry: Empowering MLLM with O1-like Reasoning and Reflection via Collective Monte Carlo Tree Search, Yao et al. arXiv 2024
  • Progressive Multimodal Reasoning via Active Retrieval, Dong et al. arXiv 2024
  • Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking, Wu et al. arXiv 2025
  • MMC: Iterative Refinement of VLM Reasoning via MCTS-Based Multimodal Critique, Liu et al. arXiv 2025
  • DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding, Li et al. CVPR 2025
  • MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision, Du et al. arXiv 2025

Part 2: MCTS for Self-Improvement via Data Generation

Foundational Self-Improvement Frameworks

  • AlphaZero-Like Tree-Search Can Guide Large Language Model Decoding and Training, Feng et al. arXiv 2023
  • Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing, Tian et al. NeurIPS 2024
  • Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning, Xie et al. arXiv 2024
  • Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents, Putta et al. arXiv 2024
  • Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers, Qi et al. arXiv 2024
  • AlphaMath Almost Zero: Process Supervision Without Process, Chen et al. NeurIPS 2024
  • CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks, Wang et al. arXiv 2024
  • Step-Level Value Preference Optimization for Mathematical Reasoning, Chen et al. arXiv 2024
  • Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning, Wang et al. arXiv 2024
  • rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking, Guan et al. arXiv 2025
  • Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training, Yuan et al. arXiv 2025
  • Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search, Shi et al. arXiv 2025
  • TreeRPO: Tree Relative Policy Optimization, Yang et al. arXiv 2025
  • MCTS-Refined CoT: High-Quality Fine-Tuning Data for LLM-Based Repository Issue Resolution, Wang et al. arXiv 2025
  • ASTRO: Teaching Language Models to Reason by Reflecting and Backtracking In-Context, Kim et al. arXiv 2025

Applications in Specific Domains

General Capabilities & Alignment

  • PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided MCTS Decoding, Chaffin et al. arXiv 2021
  • Don't Throw Away Your Value Model! Generating More Preferable Text with Value-Guided Monte-Carlo Tree Search Decoding, Liu et al. arXiv 2023
  • ARGS: Alignment as Reward-Guided Search, Khanov et al. arXiv 2024
  • Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning, Yu et al. arXiv 2024
  • PromptAgent: Strategic Planning with Language Models Enables Expert-Level Prompt Optimization, Wang et al. arXiv 2023
  • Dynamic Rewarding with Prompt Optimization Enables Tuning-Free Self-Alignment of Language Models, Singla et al. arXiv 2024
  • Optimizing Instruction Synthesis: Effective Exploration of Evolutionary Space with Tree Search, Li et al. arXiv 2024
  • STAIR: Improving Safety Alignment with Introspective Reasoning, Zhang et al. arXiv 2025
  • APRMCTS: Improving LLM-Based Automated Program Repair with Iterative Tree Search, Hu et al. arXiv 2025

Scientific & Specialized Domains

  • Can Large Language Models Play Games? A Case Study of a Self-Play Approach, Guo et al. arXiv 2024
  • A Novel Approach to Optimize Large Language Models for Named Entity Matching with Monte Carlo Tree Search, Volkova et al. Authorea 2024
  • Synthetic Data Generation from Real Data Sources Using Monte Carlo Tree Search and Large Language Models, Locowic et al. Authorea 2024
  • Monte Carlo Thought Search: Large Language Model Querying for Complex Scientific Reasoning in Catalyst Design, Sprueill et al. arXiv 2023
  • Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search, Light et al. arXiv 2024
  • Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking, Cheng et al. arXiv 2025
  • PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion, Tang et al. PMC 2025
  • Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration, Ye et al. arXiv 2024
  • What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning, Ma et al. AAAI 2025
  • Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning, Park et al. arXiv 2024
  • Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling, Li et al. arXiv 2024
  • SAPIENT: Mastering Multi-Turn Conversational Recommendation with Strategic Planning and Monte Carlo Tree Search, Du et al. arXiv 2024
  • Monte Carlo Tree Search for Comprehensive Exploration in LLM-Based Automatic Heuristic Design, Zheng et al. arXiv 2025
  • Prompt-Based Monte Carlo Tree Search for Mitigating Hallucinations in Large Models, Duan et al. arXiv 2025
  • MedS3: Towards Medical Small Language Models with Self-Evolved Slow Thinking, Jiang et al. arXiv 2025
  • Lemma: Learning from Errors for Mathematical Advancement in LLMs, Pan et al. arXiv 2025
  • Towards Stepwise Domain Knowledge-Driven Reasoning Optimization and Reflection Improvement, Liu et al. arXiv 2025
  • Trans-Zero: Self-Play Incentivizes Large Language Models for Multilingual Translation Without Parallel Data, Zou et al. arXiv 2025
  • Iris: Interactive Research Ideation System for Accelerating Scientific Discovery, Garikaparthi et al. arXiv 2025
  • Monte Carlo Planning with Large Language Model for Text-Based Game Agents, Shi et al. arXiv 2025
  • MCTSr-Zero: Self-Reflective Psychological Counseling Dialogues Generation via Principles and Adaptive Exploration, Lu et al. arXiv 2025

Multimodal Applications

  • SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement, Wang et al. arXiv 2025
  • MMC: Iterative Refinement of VLM Reasoning via MCTS-Based Multimodal Critique, Liu et al. arXiv 2025
  • MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision, Du et al. arXiv 2025

Part 3: Advanced Topics and Hybrid Approaches

Multi-Agent and Collaborative Search

  • MASTER: A Multi-Agent System with LLM Specialized MCTS, Gan et al. arXiv 2025
  • SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution, Li et al. arXiv 2025
  • Multi-LLM Collaborative Search for Complex Problem Solving, Yang et al. arXiv 2025
  • HALO: Hierarchical Autonomous Logic-Oriented Orchestration for Multi-Agent LLM Systems, Hou et al. arXiv 2025

Reward Model Design and Optimization

  • OVM, Outcome-Supervised Value Models for Planning in Mathematical Reasoning, Yu et al. arXiv 2023
  • Let’s Reward Step by Step: Step-Level Reward Model as the Navigators for Reasoning, Ma et al. arXiv 2023
  • Improve Mathematical Reasoning in Language Models by Automated Process Supervision, Luo et al. arXiv 2024
  • What Are Step-Level Reward Models Rewarding? Counterintuitive Findings from MCTS-Boosted Mathematical Reasoning, Ma et al. AAAI 2025
  • Your Reward Function for RL Is Your Best PRM for Search: Unifying RL and Search-Based TTS, Jin et al. arXiv 2025
  • MT-RewardTree: A Comprehensive Framework for Advancing LLM-Based Machine Translation via Reward Modeling, Feng et al. arXiv 2025
  • Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models, Wang et al. arXiv 2025

Search Efficiency and Dynamics

  • Information Directed Tree Search: Reasoning and Planning with Language Agents, Chandak et al. NeurIPS 2024
  • LiteSearch: Efficacious Tree Search for LLM, Wang et al. arXiv 2024
  • BoostStep: Boosting Mathematical Capability of Large Language Models via Improved Single-Step Reasoning, Zhang et al. arXiv 2025
  • Bilevel MCTS for Amortized O(1) Node Selection in Classical Planning, Asai et al. arXiv 2025
  • Skip a Layer or Loop It? Test-Time Depth Adaptation of Pretrained LLMs, Li et al. arXiv 2025
  • Time-Critical and Confidence-Based Abstraction Dropping Methods, Schmocker et al. arXiv 2025
  • Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search, Inoue et al. arXiv 2025

👏 Contributing

Contributions are highly encouraged!

If you have a relevant paper that complements this taxonomy, feel free to submit a pull request or reach out to the author directly.

Your support will help expand and improve this repository!

📖 Citation

If you find this project helpful in your research, please consider citing:

```bibtex
@article{wei2025unifying,
  title={Unifying Tree Search Algorithm and Reward Design for LLM Reasoning: A Survey},
  author={Wei, Jiaqi and Zhang, Xiang and Yang, Yuejin and Huang, Wenxuan and Cao, Juntai and Xu, Sheng and Zhuang, Xiang and Gao, Zhangyang and Abdul-Mageed, Muhammad and Lakshmanan, Laks VS and others},
  journal={arXiv preprint arXiv:2510.09988},
  year={2025}
}
```

🌟 Star History

Star History Chart
