Superseded baseline · #47 of 772 most-superseded
HRM
LLM reasoning / chain-of-thought
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 1 paper beats it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites HRM as a baseline.
“SR² surpasses HRM by 11.6% on Sudoku-Extreme and 19.2% on Maze-Hard, while using only one eighth of HRM's parameters.”
— Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
Beaten on benchmarks
Head-to-head results where a newer method reports beating HRM. Values are copied from the source paper's tables — verify against the cited paper.
- Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
SR² beats HRM · Sudoku-Extreme
66.63 (SR²) vs 55.0 (HRM)
- Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
SR² beats HRM · Maze-Hard
93.7 (SR²) vs 74.5 (HRM)
- Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
SR² beats HRM · ARC-1
44.3 (SR²) vs 40.3 (HRM)
- Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
SR² beats HRM · ARC-2
6.7 (SR²) vs 5.0 (HRM)
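The size of each gap can be read in two ways: as an absolute difference in score points, or as a gain relative to HRM's score. The following is a minimal sketch that computes both from the values listed above. The names `scores`, `margins`, and `results` are illustrative, and treating the scores as percentages is an assumption; this entry does not state the metric or its unit.

```python
# Scores copied from the head-to-head list above as (SR², HRM) pairs.
# Assumption: the values are percentages; the entry does not state the unit.
scores = {
    "Sudoku-Extreme": (66.63, 55.0),
    "Maze-Hard": (93.7, 74.5),
    "ARC-1": (44.3, 40.3),
    "ARC-2": (6.7, 5.0),
}


def margins(sr2, hrm):
    """Return (absolute gap in points, gain relative to HRM in percent)."""
    absolute = sr2 - hrm
    relative = 100.0 * absolute / hrm
    return round(absolute, 2), round(relative, 1)


results = {name: margins(*pair) for name, pair in scores.items()}

for name, (absolute, relative) in results.items():
    print(f"{name}: +{absolute} points absolute, +{relative}% relative to HRM")
```

The absolute gaps for Sudoku-Extreme and Maze-Hard come out to 11.63 and 19.2, which matches the 11.6% and 19.2% in the quoted critique sentence. This suggests the quote reports absolute point differences, not relative improvements; as the entry notes, verify against the cited paper.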
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.