Superseded baseline · #28 of 772 most-superseded
Qwen2.5-7B-Instruct
LLM reasoning / chain-of-thought
Superseded: cited as a baseline and beaten by newer methods
0 papers critique it · 2 beat it on benchmarks
Beaten on benchmarks
Head-to-head results where a newer method reports beating Qwen2.5-7B-Instruct. Values are copied from the source paper's tables — verify against the cited paper.
- Your Reward Function for RL is Your Best PRM for Search: Unifying RL and Search-Based TTS
AIRL-S beats Qwen2.5-7B-Instruct · Leetcode [Coding]
54.4 vs 47.4
- Mitigating Deceptive Alignment via Self-Monitoring
Self-Monitor-7B beats Qwen2.5-7B-Instruct · ASR [Qwen2.5-7B]
0.050 vs 0.740
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 19, 2026
- May 4, 2026