MoEQuant
MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
Mixture-of-experts routing · first seen May 2, 2025
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 4 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites MoEQuant as a baseline.
“While MoEQuant [hu2025moequant] attempts to alleviate this issue through expert-balanced self-sampling, such generative calibration approaches may compromise fair comparisons with other baselines.”
— EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
“MoEQuant [moequant] uses routing statistics to balance the contributions of each expert during calibration, but its performance remains unsatisfactory under quantization of $ 4$ bits.”
— KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
Beaten on benchmarks
Head-to-head results where a newer method reports beating MoEQuant. Values are copied from the source paper's tables — verify against the cited paper.
- GEMQ: Global Expert-Level Mixed-Precision Quantization for MoE LLMs
GEMQ beats MoEQuant · 0-shot accuracy [3-16 bits (weight-activation)]
59.49 vs 57.24
- BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization
BitsMoE beats MoEQuant · Avg. Accuracy [DeepSeek-V2-Lite, 2-bit]
41.04 vs 40.40
- BitsMoE: Efficient Spectral Energy-Guided Bit Allocation for MoE LLM Quantization
BitsMoE beats MoEQuant · Avg. Accuracy [Qwen3-30B-A3B-Base, 2-bit]
61.91 vs 52.21
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · WikiText2 Perplexity [DeepSeek-MoE-16B W4A16]
6.70 vs 6.78
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · C4 Perplexity [DeepSeek-MoE-16B W4A16]
9.18 vs 9.22
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · WikiText2 Perplexity [DeepSeek-MoE-16B W3A16]
7.27 vs 7.55
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · C4 Perplexity [DeepSeek-MoE-16B W3A16]
10.04 vs 10.88
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · WikiText2 Perplexity [Mixtral-8x7B W4A16]
4.03 vs 4.12
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · C4 Perplexity [Mixtral-8x7B W4A16]
7.10 vs 7.34
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · WikiText2 Perplexity [Mixtral-8x7B W3A16]
4.49 vs 4.90
- EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization
EAQuant beats MoEQuant · C4 Perplexity [Mixtral-8x7B W3A16]
7.47 vs 8.24
- KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
KBVQ-MoE beats MoEQuant · Avg Acc [Qwen1.5-MoE-A2.7B, 2-bit]
62.78 vs 34.64
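The entries above mix two kinds of metric: accuracy, where higher is better, and perplexity, where lower is better. The sketch below shows one way to put the gaps on a common footing. The numbers are transcribed from the list above and should be checked against the cited papers; the names `RESULTS` and `margin` and the choice to report perplexity gaps as a relative reduction are illustrative assumptions, not part of any cited paper.

```python
# Head-to-head values transcribed from the list above (verify against the papers).
# Row layout: (newer method, setting, metric kind, newer value, MoEQuant value)
RESULTS = [
    ("GEMQ", "0-shot accuracy", "acc", 59.49, 57.24),
    ("BitsMoE", "DeepSeek-V2-Lite, 2-bit", "acc", 41.04, 40.40),
    ("BitsMoE", "Qwen3-30B-A3B-Base, 2-bit", "acc", 61.91, 52.21),
    ("EAQuant", "DeepSeek-MoE-16B W4A16, WikiText2", "ppl", 6.70, 6.78),
    ("EAQuant", "DeepSeek-MoE-16B W4A16, C4", "ppl", 9.18, 9.22),
    ("EAQuant", "DeepSeek-MoE-16B W3A16, WikiText2", "ppl", 7.27, 7.55),
    ("EAQuant", "DeepSeek-MoE-16B W3A16, C4", "ppl", 10.04, 10.88),
    ("EAQuant", "Mixtral-8x7B W4A16, WikiText2", "ppl", 4.03, 4.12),
    ("EAQuant", "Mixtral-8x7B W4A16, C4", "ppl", 7.10, 7.34),
    ("EAQuant", "Mixtral-8x7B W3A16, WikiText2", "ppl", 4.49, 4.90),
    ("EAQuant", "Mixtral-8x7B W3A16, C4", "ppl", 7.47, 8.24),
    ("KBVQ-MoE", "Qwen1.5-MoE-A2.7B, 2-bit", "acc", 62.78, 34.64),
]


def margin(kind: str, newer: float, baseline: float) -> float:
    """Return a gap that is positive when the newer method is better.

    Accuracy: absolute gain in points (higher is better).
    Perplexity: relative reduction in percent (lower is better).
    """
    if kind == "acc":
        return newer - baseline
    if kind == "ppl":
        return (baseline - newer) / baseline * 100.0
    raise ValueError(f"unknown metric kind: {kind}")


for method, setting, kind, newer, baseline in RESULTS:
    unit = "points" if kind == "acc" else "% lower perplexity"
    print(f"{method:9s} {setting:36s} {margin(kind, newer, baseline):6.2f} {unit}")
```

Read this way, the gaps are not all of the same size: some perplexity differences are under one percent, while the 2-bit accuracy gaps run to many points. Results from different papers use different models, bit-widths, and calibration setups, so the margins are not directly comparable across rows.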
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 22, 2026
- May 21, 2026
- KBVQ-MoE · KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models · Jan 30, 2026
- Oct 13, 2025