PRIME (LLM reasoning / chain-of-thought): superseded, meaning it is cited as a baseline and beaten by newer methods. 2 papers critique it and 1 beats it on benchmarks; it ranks #19 of 772 most-superseded. Sub-problem: cluster led by MCTS. Newer alternatives in the same sub-problem include rePIRL and CoRD.


Superseded baseline · #19 of 772 most-superseded

PRIME

LLM reasoning / chain-of-thought

superseded — cited as a baseline and beaten by newer methods

2 papers critique it · 1 beats it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites PRIME as a baseline.

“this specific format only holds under certain assumptions”
— rePIRL: Learn PRM with Inverse RL for LLM Reasoning
“PRIME differs crucially by: [(a)] Employing per-token rewards derived from log-likelihood ratios, which reward-guided generation literatures (discrete GANs, human preference modeling, generation quality evaluation etc) suggests is much less effective than our holistic step-wise discriminators.”
— Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS
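The second critique refers to per-token rewards derived from log-likelihood ratios. As a rough illustration of what such a reward could look like, the sketch below scores each token by the difference between its log-probability under a policy model and under a reference model. The function name `per_token_rewards`, the scaling factor `beta`, and the example probabilities are assumptions made for illustration; they are not taken from the cited papers and may differ from PRIME's actual formulation.

```python
import math


def per_token_rewards(policy_logprobs, reference_logprobs, beta=1.0):
    """Return one reward per token: beta * (log p_policy - log p_reference).

    Hypothetical helper: `beta` and the exact form of the ratio are
    assumptions, not details confirmed by the source page.
    """
    if len(policy_logprobs) != len(reference_logprobs):
        raise ValueError("both sequences must cover the same tokens")
    return [beta * (p - r) for p, r in zip(policy_logprobs, reference_logprobs)]


# Made-up log-probabilities for a three-token response (illustrative only).
policy = [math.log(0.5), math.log(0.25), math.log(0.8)]
reference = [math.log(0.25), math.log(0.25), math.log(0.4)]

rewards = per_token_rewards(policy, reference)
for position, reward in enumerate(rewards):
    print(f"token {position}: reward {reward:.3f}")
```

A token the policy finds more likely than the reference does gets a positive reward, and a token both models rate equally gets zero. The critique contrasts this token-level signal with discriminators that score a whole reasoning step at once.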

Beaten on benchmarks

Head-to-head results where a newer method reports beating PRIME. Values are copied from the source paper's tables — verify against the cited paper.

rePIRL beats PRIME · Math Avg. [Qwen2.5-3B-Instruct Math] · 33.5 vs 29.9 · rePIRL: Learn PRM with Inverse RL for LLM Reasoning
rePIRL beats PRIME · Coding Avg. [Qwen2.5-3B-Instruct Coding] · 27.7 vs 25.2 · rePIRL: Learn PRM with Inverse RL for LLM Reasoning
rePIRL beats PRIME · Math Avg. [Qwen3-4B-Base Math] · 41.6 vs 39.3 · rePIRL: Learn PRM with Inverse RL for LLM Reasoning
rePIRL beats PRIME · Coding Avg. [Qwen3-4B-Base Coding] · 39.8 vs 31.1 · rePIRL: Learn PRM with Inverse RL for LLM Reasoning
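To compare the four head-to-head results, the gaps can be computed directly. The sketch below takes the values as listed on this page and assumes that the first number in each pair is rePIRL's score and the second is PRIME's; the numbers have not been checked against the source paper.

```python
# (model, metric, first score, second score), copied from the entries above.
# Assumption: the first score belongs to rePIRL and the second to PRIME.
RESULTS = [
    ("Qwen2.5-3B-Instruct", "Math Avg.", 33.5, 29.9),
    ("Qwen2.5-3B-Instruct", "Coding Avg.", 27.7, 25.2),
    ("Qwen3-4B-Base", "Math Avg.", 41.6, 39.3),
    ("Qwen3-4B-Base", "Coding Avg.", 39.8, 31.1),
]


def margin(new_score, baseline_score):
    """Return (absolute gain in points, relative gain in percent), rounded to 0.1."""
    absolute = new_score - baseline_score
    relative_pct = 100.0 * absolute / baseline_score
    return round(absolute, 1), round(relative_pct, 1)


margins = [margin(new, base) for _, _, new, base in RESULTS]
for (model, metric, _, _), (abs_gain, rel_gain) in zip(RESULTS, margins):
    print(f"{model} {metric}: +{abs_gain} points ({rel_gain}% relative)")
```

Under these assumptions the absolute gaps are 3.6, 2.5, 2.3, and 8.7 points, so the largest reported gap is on Coding Avg. with Qwen3-4B-Base.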

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.

rePIRL · rePIRL: Learn PRM with Inverse RL for LLM Reasoning · May 19, 2026
CoRD · Distilling Long-CoT Reasoning through Collaborative Step-wise Multi-Teacher Decoding · May 4, 2026