MCTS
LLM reasoning / chain-of-thought
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 2 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites MCTS as a baseline.
“the MCTS process is time-consuming and computationally expensive, especially when calculating token-level rewards.”
— rePIRL: Learn PRM with Inverse RL for LLM Reasoning
“the search space grows exponentially with trajectory length”
— Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding
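The second critique can be made concrete with a small count. The sketch below is illustrative only: it assumes a uniform branching factor (a hypothetical `branching_factor` candidate steps per node), which the quoted paper does not state, and simply counts the root-to-leaf trajectories a full tree of a given depth would contain.

```python
def count_trajectories(branching_factor: int, depth: int) -> int:
    """Count root-to-leaf paths in a full tree.

    Assumption: every node expands into the same number of candidate
    next steps, so the count is branching_factor ** depth.
    """
    return branching_factor ** depth


if __name__ == "__main__":
    # Hypothetical setting: 4 candidate reasoning steps per node.
    for depth in (1, 5, 10, 20):
        print(f"depth={depth:2d}  trajectories={count_trajectories(4, depth)}")
```

Under this assumption, doubling the trajectory length squares the number of candidate trajectories, which is why tree search over long reasoning chains relies on pruning or sampling rather than exhaustive expansion.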
Beaten on benchmarks
Head-to-head results where a newer method reports beating MCTS. Values are copied from the source paper's tables — verify against the cited paper.
- rePIRL: Learn PRM with Inverse RL for LLM Reasoning
  rePIRL beats MCTS · Math Avg. [Qwen2.5-3B-Instruct Math] · 33.5 vs 30.7
  rePIRL beats MCTS · Coding Avg. [Qwen2.5-3B-Instruct Coding] · 27.7 vs 25.7
  rePIRL beats MCTS · Math Avg. [Qwen3-4B-Base Math] · 41.6 vs 36.1
  rePIRL beats MCTS · Coding Avg. [Qwen3-4B-Base Coding] · 39.8 vs 36.4
- MC-NEST: Enhancing Mathematical Reasoning in Large Language Models leveraging a Monte Carlo Self-Refine Tree
  MC-NEST beats MCTS · Solved Problems [GPT-4o on AIME] · 58 vs 49
  MC-NEST beats MCTS · Solved Problems [GPT-4o on MathOdyssey] · 20 vs 16
  MC-NEST beats MCTS · Pass@1 [AIME - GPT-4o] · 38.6 vs 32.6
  MC-NEST beats MCTS · Pass@1 [MathOdyssey - GPT-4o] · 13.3 vs 10.6
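To compare the size of these gaps across metrics, the margins can be computed directly from the listed values. The sketch below is a convenience calculation, not something reported by either paper. It assumes that the first number in each pair is the newer method's score and the second is the MCTS baseline, and the helper name `margin` is made up for illustration.

```python
# (method, metric and setting, newer method's score, MCTS score),
# copied from the head-to-head list above. Assumption: the first number
# in each "X vs Y" pair belongs to the newer method.
RESULTS = [
    ("rePIRL", "Math Avg. [Qwen2.5-3B-Instruct Math]", 33.5, 30.7),
    ("rePIRL", "Coding Avg. [Qwen2.5-3B-Instruct Coding]", 27.7, 25.7),
    ("rePIRL", "Math Avg. [Qwen3-4B-Base Math]", 41.6, 36.1),
    ("rePIRL", "Coding Avg. [Qwen3-4B-Base Coding]", 39.8, 36.4),
    ("MC-NEST", "Solved Problems [GPT-4o on AIME]", 58, 49),
    ("MC-NEST", "Solved Problems [GPT-4o on MathOdyssey]", 20, 16),
    ("MC-NEST", "Pass@1 [AIME - GPT-4o]", 38.6, 32.6),
    ("MC-NEST", "Pass@1 [MathOdyssey - GPT-4o]", 13.3, 10.6),
]


def margin(newer, baseline):
    """Return (absolute gap, relative gap in percent), rounded to one decimal."""
    absolute = round(newer - baseline, 1)
    relative_pct = round(100 * (newer - baseline) / baseline, 1)
    return absolute, relative_pct


if __name__ == "__main__":
    for method, setting, newer, baseline in RESULTS:
        absolute, relative_pct = margin(newer, baseline)
        print(f"{method:8s} {setting:42s} +{absolute} ({relative_pct}% over MCTS)")
```

The relative figure is only comparable within a metric family: solved-problem counts and averaged accuracy scores sit on different scales, so the source tables remain the reference.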
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 19, 2026
- May 4, 2026