LG · Jun 16, 2025

Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization

Badr AlKhamissi, C. Nicolò De Sabbata, Greta Tuckute, Zeming Chen, Martin Schrimpf, Antoine Bosselut

arXiv:2506.13331v2 · 13.06 citations · h-index: 21

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable and controllable AI models, particularly for applications requiring human-like reasoning, though it is incremental in building on existing transformer and modular methods.

The paper proposes MiCRo, a modular transformer architecture with brain-inspired specialized experts, to make language models more human-like and interpretable. MiCRo outperforms or matches comparable baselines on reasoning benchmarks such as GSM8K and BBH, and its outputs can be steered dynamically at inference time.

Human cognitive behavior arises from the interaction of specialized brain networks dedicated to distinct functions, such as language, logic, and social reasoning. Inspired by this organization, we propose Mixture of Cognitive Reasoners (MiCRo): a modular, transformer-based architecture post-trained with a curriculum that induces functional specialization across experts. Concretely, we partition the layers of a pretrained language model into four expert modules aligned with well-studied cognitive networks in the human brain. MiCRo offers three key advantages over standard language models. (1) The specialized experts are interpretable and causally meaningful -- ablating a module causes substantial drops on benchmarks requiring its specialized domain. (2) MiCRo's behavior can be dynamically steered at inference time by routing tokens to particular experts (e.g., favoring social over logical reasoning), enabling fine-grained control over outputs. (3) MiCRo outperforms or matches comparable baselines on both machine-learning reasoning benchmarks (e.g., GSM8K, BBH) and alignment to human behavior (CogBench), while maintaining interpretability. Taken together, cognitively grounded functional specialization yields models that are both more human-like and more human-interpretable.
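To make the routing, ablation, and steering ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not specify the router, so the top-1 softmax router, the additive logit bias used for steering, the function names, and the fourth expert name (`world`) are all assumptions made for illustration.

```python
import math

# Four expert modules. "language", "logic", and "social" come from the
# abstract; "world" is a placeholder name assumed for the fourth expert.
EXPERTS = ["language", "logic", "social", "world"]


def softmax(scores):
    """Numerically stable softmax over a list of floats (-inf allowed)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, ablate=(), bias=None):
    """Pick one expert for a single token (hypothetical top-1 router).

    router_logits: one score per expert, in EXPERTS order.
    ablate: expert names to remove from consideration (module ablation).
    bias: dict mapping expert name to an additive logit bias (steering).
    Returns (chosen_expert_name, routing_probabilities).
    """
    bias = bias or {}
    if all(name in ablate for name in EXPERTS):
        raise ValueError("cannot ablate every expert")
    adjusted = []
    for name, logit in zip(EXPERTS, router_logits):
        if name in ablate:
            adjusted.append(float("-inf"))  # ablated expert gets zero mass
        else:
            adjusted.append(logit + bias.get(name, 0.0))
    probs = softmax(adjusted)
    best = max(range(len(EXPERTS)), key=lambda i: probs[i])
    return EXPERTS[best], probs


# Example token whose router scores favor the logic expert.
logits = [0.1, 2.0, 1.5, -0.3]
default_choice, _ = route(logits)
ablated_choice, ablated_probs = route(logits, ablate={"logic"})
steered_choice, _ = route(logits, bias={"social": 1.0})
```

In this toy setup, ablating an expert forces its tokens onto the next-best module, and a positive bias on one expert shifts routing toward it at inference time without any retraining. This mirrors, in simplified form, advantages (1) and (2) described in the abstract.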
