CLAIJun 4

Value-and-Structure Alignment for Routing-Consistent Quantization of Mixture-of-Experts Models

arXiv:2606.0568872.4
Predicted impact top 71% in CL · last 90 daysOriginality Incremental advance
AI Analysis

For practitioners deploying large MoE models, VSRAQ enables more accurate quantization by maintaining routing consistency, addressing a critical bottleneck in MoE deployment.

MoE models suffer from routing instability under quantization, where small perturbations alter expert selection and degrade quality. VSRAQ preserves pre-quantization expert-selection behavior via value and structure alignment, reducing degradation without inference overhead and outperforming baselines on recent MoE models.

Mixture-of-Experts (MoE) models scale foundation models efficiently by activating only a subset of experts for each token, but their large number of expert parameters still makes quantization essential for practical deployment. Unlike dense models, however, MoE models are sensitive to routing instability: small quantization-induced perturbations can change the top-$k$ expert selection, altering the computation path and degrading model quality. We propose Value-and-Structure Routing Alignment for Quantization (VSRAQ), a MoE-specific post-training quantization objective that preserves pre-quantization expert-selection behavior under quantization. VSRAQ combines two complementary objectives that jointly preserve expert-selection behavior: value alignment, which matches routing-relevant logits or scores, and structure alignment, which preserves expert ordering and top-$k$ decision boundaries. By maintaining routing consistency, VSRAQ reduces quantization-induced degradation without introducing any inference-time overhead and can be integrated into existing quantization frameworks. Experiments on recent MoE foundation models show that VSRAQ improves expert-selection consistency and consistently outperforms reconstruction-only and router-aware baselines.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes