HRM (LLM reasoning / chain-of-thought): superseded, meaning it is cited as a baseline and beaten by newer methods. 1 paper critiques it and 1 beats it on benchmarks; it ranks #47 of 772 most-superseded. Sub-problem: cluster led by ReAct. Newer alternatives in the same sub-problem include OLIVIA, Planner-centric Plan-Execute paradigm, SR², DS-STAR.

Superseded baseline · #47 of 772 most-superseded

HRM

LLM reasoning / chain-of-thought

superseded — cited as a baseline and beaten by newer methods

1 paper critiques it · 1 beats it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites HRM as a baseline.

“SR² surpasses HRM by 11.6% on Sudoku-Extreme and 19.2% on Maze-Hard, while using only one eighth of HRM's parameters.”
— Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens

Beaten on benchmarks

Head-to-head results where a newer method reports beating HRM. Values are copied from the source paper's tables — verify against the cited paper.

Benchmark · SR² · HRM
Sudoku-Extreme · 66.63 · 55.0
Maze-Hard · 93.7 · 74.5
ARC-1 · 44.3 · 40.3
ARC-2 · 6.7 · 5.0
Source for all rows: Selection, Reflection and Self-Refinement: Revisit Reasoning Tasks via a Causal Lens
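
As a quick sanity check on the head-to-head values listed above, the following minimal sketch computes the absolute gap and the relative gain of SR² over HRM on each benchmark. The names `scores` and `margins` are illustrative and not taken from the cited paper. The absolute gaps on Sudoku-Extreme and Maze-Hard come out to 11.63 and 19.2, which line up with the quoted "11.6%" and "19.2%", suggesting those figures are absolute differences rather than relative gains.

```python
# Scores copied from the head-to-head entries above, stored as (SR², HRM).
# The dict and function names are illustrative assumptions.
scores = {
    "Sudoku-Extreme": (66.63, 55.0),
    "Maze-Hard": (93.7, 74.5),
    "ARC-1": (44.3, 40.3),
    "ARC-2": (6.7, 5.0),
}


def margins(sr2, hrm):
    """Return (absolute gap in points, relative gain over HRM in percent)."""
    gap = sr2 - hrm
    return round(gap, 2), round(100 * gap / hrm, 1)


results = {name: margins(*pair) for name, pair in scores.items()}

for name, (gap, rel) in results.items():
    print(f"{name}: +{gap} points ({rel}% relative to HRM)")
```

The relative gains are noticeably larger than the absolute gaps on the low-scoring benchmarks (ARC-2 in particular), so it is worth stating which of the two is meant when comparing these methods.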

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.