NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons
NeuReasoner addresses reliability and efficiency problems in AI reasoning systems, offering researchers and practitioners a single unified framework, though it builds incrementally on existing neuron-analysis methods.
The paper tackles persistent failure modes in Large Reasoning Models by proposing NeuReasoner, a framework that uses Mixture-of-Neurons analysis for explainable and controllable reasoning, achieving performance gains of up to 27.0% and reducing token consumption by 19.6% to 63.3% across benchmarks.
Large Reasoning Models (LRMs) have recently achieved remarkable success in complex reasoning tasks. However, closer scrutiny reveals persistent failure modes that compromise both performance and cost at three levels: I) the intra-step level, marked by calculation or derivation errors; II) the inter-step level, involving oscillation and stagnation; and III) the instance level, where the model over-thinks maladaptively. Existing approaches target these levels in isolation rather than jointly, and their black-box nature and reliance on RL hinder explainability and controllability. To bridge these gaps, we conduct an in-depth white-box analysis, identifying key neurons (Mixture of Neurons, MoN) and the fluctuation patterns that are associated with distinct failures. Building upon these insights, we propose NeuReasoner, an explainable, controllable, and unified reasoning framework driven by MoN. Technically, NeuReasoner integrates lightweight MLPs for failure detection with a special-token-triggered self-correction mechanism learned via SFT. During inference, special tokens are inserted when a failure is detected, triggering controllable remedial behaviors. Extensive evaluations across six benchmarks and six backbone models (8B to 70B), against nine competitive baselines, demonstrate that NeuReasoner achieves performance gains of up to 27.0% while reducing token consumption by 19.6% to 63.3%.
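The inference-time mechanism described in the abstract (a lightweight MLP reads key-neuron activations and a special token is inserted when a failure is detected) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The token name `<correct>`, the function names, the weights, the threshold, and the toy activations are all hypothetical, and a single detector stands in for whatever per-failure detectors the paper actually uses.

```python
import math
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical special token; the abstract does not name the actual tokens.
CORRECT_TOKEN = "<correct>"


def mlp_failure_prob(
    x: Sequence[float],
    w1: Sequence[Sequence[float]],
    b1: Sequence[float],
    w2: Sequence[float],
    b2: float,
) -> float:
    """One-hidden-layer MLP (ReLU hidden units, sigmoid output).

    x holds the activations of the monitored key neurons at one step.
    Returns an assumed probability that the step is a failure.
    """
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))


def decode_with_monitor(
    steps: Iterable[Tuple[str, Sequence[float]]],
    detector: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> List[str]:
    """Append the special token after any step the detector flags.

    steps yields (generated_text, key_neuron_activations) pairs. In a real
    system these would come from the model's forward pass; here they are
    supplied by the caller.
    """
    output: List[str] = []
    for text, activations in steps:
        output.append(text)
        if detector(activations) > threshold:
            # In the described framework, the inserted token would trigger
            # the self-correction behavior learned via SFT.
            output.append(CORRECT_TOKEN)
    return output


# Toy weights chosen by hand for illustration only (not learned).
W1, B1, W2, B2 = [[1.0, -1.0]], [0.0], [4.0], -2.0


def toy_detector(acts: Sequence[float]) -> float:
    return mlp_failure_prob(acts, W1, B1, W2, B2)


toy_steps = [
    ("2+2", [0.1, 0.0]),  # low activation: not flagged
    ("=5", [2.0, 0.0]),   # high activation: flagged as a failure
    ("so", [0.0, 1.0]),   # not flagged
]
result = decode_with_monitor(toy_steps, toy_detector)
```

With these toy values only the second step exceeds the threshold, so the special token is inserted once, directly after it. Keeping the detector as a plain callable separates failure detection from the remedial action, which matches the abstract's framing of detection and token-triggered correction as distinct components.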