Is EurusPRM-Stage1 superseded?

EurusPRM-Stage1 (LLM reasoning / chain-of-thought) is superseded: it is cited as a baseline and beaten by newer methods. No papers critique it, and 2 report beating it on benchmarks. It ranks #25 of 772 among the most-superseded methods. Sub-problem: cluster led by ORM. Newer alternatives in the same sub-problem include SCI-PRM, GR-Ben, MedPRMBench, DC-W2S, and CoTZero.

Beaten on benchmarks

Head-to-head results where a newer method reports beating EurusPRM-Stage1. Values are copied from each source paper's tables; verify them against the cited paper.

GroundedPRM beats EurusPRM-Stage1 · F1 score [auto-labeled supervision at 40K samples] · 39.7 vs 31.2
Source: GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning

GenPRM-7B (Maj@8) beats EurusPRM-Stage1 · Avg. [PRMs (7-8B)] · 80.5 vs 31.2
Source: GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.

SCI-PRM · SCI-PRM: A Tool Aware Process Reward Model for Scientific Reasoning Verification · Jun 3, 2026
GR-Ben · GR-Ben: A General Reasoning Benchmark for Evaluating Process Reward Models · May 2, 2026
MedPRMBench · MedPRMBench: A Fine-grained Benchmark for Process Reward Models in Medical Reasoning · Apr 19, 2026
DC-W2S · DC-W2S: Dual-Consensus Weak-to-Strong Training for Reliable Process Reward Modeling in Biological Reasoning · Mar 9, 2026
CoTZero · CoTZero: Annotation-Free Human-Like Vision Reasoning via Hierarchical Synthetic CoT · Feb 9, 2026
FunPRM · FunPRM: Function-as-Step Process Reward Model with Meta Reward Correction for Code Generation · Jan 29, 2026
Noise-Aware Iterative Training (NAIT) · Towards Robust Process Reward Modeling via Noise-aware Learning · Jan 19, 2026
GroundedPRM · GroundedPRM: Tree-Guided and Fidelity-Aware Process Reward Modeling for Step-Level Reasoning · Oct 16, 2025
group-relative advantage reinforcement learning · Boosting Process-Correct CoT Reasoning by Modeling Solvability of Multiple-Choice QA · Sep 30, 2025