LG · AI · May 17

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

U of Toronto
arXiv:2601.03425 · 84.43 citations · h-index: 11
AI Analysis

For researchers and practitioners using MoE models, this work reveals a structural bias toward centralized computation that undermines the expected specialization. It also suggests that current training objectives, such as load-balancing losses, may be suboptimal.

The paper challenges the assumption that Mixture-of-Experts models achieve domain specialization through sparse routing. It reveals a domain-invariant 'Standing Committee' of experts that captures most of the routing mass across domains, layers, and routing budgets, which indicates that specialization is less pervasive than commonly believed.

Mixture of Experts models are widely assumed to achieve domain specialization through sparse routing. In this work, we question this assumption by introducing COMMITTEEAUDIT, a post hoc framework that analyzes routing behavior at the level of expert groups rather than individual experts. Across three representative models and the MMLU benchmark, we uncover a domain-invariant Standing Committee. This is a compact coalition of routed experts that consistently captures the majority of routing mass across domains, layers, and routing budgets, even when architectures already include shared experts. Qualitative analysis further shows that Standing Committees anchor reasoning structure and syntax, while peripheral experts handle domain-specific knowledge. These findings reveal a strong structural bias toward centralized computation, suggesting that specialization in Mixture of Experts models is far less pervasive than commonly believed. This inherent bias also indicates that current training objectives, such as load-balancing losses that enforce uniform expert utilization, may be working against the model's natural optimization path, thereby limiting training efficiency and performance.
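
To make the idea of a group-level routing audit concrete, here is a minimal sketch in Python. It is not the COMMITTEEAUDIT implementation, whose details the abstract does not give. It assumes routing weights are already logged per token as a mapping from expert id to gate weight, and it uses a deliberately simple rule: an expert belongs to the candidate committee if it ranks among the top `size` experts by routing mass in every domain. The function names, the toy data, and the `size` parameter are all hypothetical.

```python
from collections import defaultdict


def expert_mass(routing_rows):
    """Sum gate weights per expert and normalize to shares that sum to 1."""
    totals = defaultdict(float)
    for row in routing_rows:  # one row per token: {expert_id: gate_weight}
        for expert, weight in row.items():
            totals[expert] += weight
    grand_total = sum(totals.values())
    return {expert: weight / grand_total for expert, weight in totals.items()}


def top_experts(shares, size):
    """Return the `size` experts with the largest share of routing mass."""
    ranked = sorted(shares, key=lambda e: (-shares[e], e))  # ties broken by id
    return set(ranked[:size])


def standing_committee(domain_routing, size):
    """Find experts in the top-`size` set of every domain (assumed rule).

    Returns the candidate committee and, per domain, the fraction of
    routing mass that the committee captures.
    """
    shares = {d: expert_mass(rows) for d, rows in domain_routing.items()}
    committee = set.intersection(*(top_experts(s, size) for s in shares.values()))
    coverage = {
        d: sum(s.get(e, 0.0) for e in committee) for d, s in shares.items()
    }
    return committee, coverage


# Hypothetical toy routing logs for one layer: two domains, two tokens each.
toy = {
    "math": [{0: 0.6, 1: 0.3, 2: 0.1}, {0: 0.5, 1: 0.4, 3: 0.1}],
    "law": [{0: 0.5, 1: 0.3, 4: 0.2}, {0: 0.6, 1: 0.3, 5: 0.1}],
}
committee, coverage = standing_committee(toy, size=2)
print(sorted(committee), coverage)
```

In this toy setup, a high coverage value in every domain would correspond to the centralized pattern the abstract describes, while domain-specific experts would show up only in the remaining mass. A real audit would repeat this per layer and per routing budget, which the sketch leaves out.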

