LG · AI · CL · Feb 15, 2025

Probing Semantic Routing in Large Mixture-of-Expert Models

arXiv:2502.10928v2 · 7 citations · h-index: 5 · EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses the problem of understanding functional differentiation in large MoE models, a question of interest to AI researchers. The advance is incremental, as it builds on prior work exploring routing behavior.

The study investigated whether expert routing in large mixture-of-expert models is influenced by input semantics, finding statistically significant evidence of semantic routing through controlled experiments comparing expert overlap across conditions.

In the past year, large (>100B parameter) mixture-of-expert (MoE) models have become increasingly common in the open domain. While their advantages are often framed in terms of efficiency, prior work has also explored functional differentiation through routing behavior. We investigate whether expert routing in large MoE models is influenced by the semantics of the inputs. To test this, we design two controlled experiments. First, we compare activations on sentence pairs with a shared target word used in the same or different senses. Second, we fix context and substitute the target word with semantically similar or dissimilar alternatives. Comparing expert overlap across these conditions reveals clear, statistically significant evidence of semantic routing in large MoE models.
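The comparison of expert overlap across conditions can be illustrated with a small sketch. The abstract does not say which overlap measure or significance test the authors use, so everything below is an assumption for illustration: Jaccard overlap between the sets of experts selected for the target token, averaged over layers, and a one-sided permutation test on the difference between the "same sense" and "different sense" conditions. The function names and the toy overlap scores are hypothetical.

```python
import random
from statistics import mean


def expert_overlap(a, b):
    """Jaccard overlap between two collections of expert indices (assumed metric)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_layer_overlap(routes_a, routes_b):
    """Average overlap over layers.

    Each argument is a list with one entry per MoE layer, holding the
    expert indices selected for the target token at that layer.
    """
    return mean(expert_overlap(x, y) for x, y in zip(routes_a, routes_b))


def permutation_test(same, diff, n_perm=10_000, seed=0):
    """One-sided permutation test for mean(same) > mean(diff).

    Returns the observed difference in means and an estimated p-value.
    """
    rng = random.Random(seed)
    observed = mean(same) - mean(diff)
    pooled = list(same) + list(diff)
    n_same = len(same)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = mean(pooled[:n_same]) - mean(pooled[n_same:])
        if stat >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy per-pair overlap scores (made up, not results from the paper).
same_sense = [0.90, 0.80, 0.85, 0.95]
diff_sense = [0.10, 0.20, 0.15, 0.05]
gap, p_value = permutation_test(same_sense, diff_sense, n_perm=2000)
```

Under this setup, semantic routing would show up as a positive gap with a small p-value: pairs that share a word sense would route to more of the same experts than pairs that do not. The second experiment in the abstract (substituting similar or dissimilar words in a fixed context) fits the same template with the two conditions swapped in.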

