Is Hessian superseded?

Hessian (Mixture-of-experts routing) is superseded: it is cited as a baseline and beaten by newer methods. No papers critique it, and 2 papers report beating it on benchmarks, which ranks it #81 of 1,370 most-superseded methods. Sub-problem: the cluster led by MoEQuant. Newer alternatives in the same sub-problem include BitsMoE, GEMQ, KBVQ-MoE, and MC# (Mixture-Compressor-sharp).



Beaten on benchmarks

Head-to-head results where a newer method reports beating Hessian. Values are copied from the source paper's tables — verify against the cited paper.

Router norm + Max var (Ours) beats Hessian · Avg. [Mixtral 8x7B, 2.5 bits/expert]: 68.38 vs 67.18
Source: Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees

PMQ beats Hessian · Avg. (%) [Mixtral 8×7b at 2.54-bit]: 67.50 vs 67.18
Source: MC#: Mixture Compressor for Mixture-of-Experts Large Models

PMQ beats Hessian · Avg. (%) [DeepSeek-VL2-L at 2.57-bit]: 70.60 vs 67.79
Source: MC#: Mixture Compressor for Mixture-of-Experts Large Models

PMQ beats Hessian · Avg. (%) [DeepSeek-VL2-S at 2.58-bit]: 63.66 vs 61.79
Source: MC#: Mixture Compressor for Mixture-of-Experts Large Models
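
To make the size of each reported gap easier to read, the head-to-head values listed above can be turned into margins (newer method minus Hessian). The following is a minimal sketch, not part of the knowledge base: the names `RESULTS` and `margins` are hypothetical, the numbers are copied from the entries above, and the margins are assumed to be in average-score points.

```python
# Head-to-head averages copied from the entries above:
# (newer method, setting, newer method's score, Hessian's score).
RESULTS = [
    ("Router norm + Max var", "Mixtral 8x7B, 2.5 bits/expert", 68.38, 67.18),
    ("PMQ", "Mixtral 8x7b, 2.54-bit", 67.50, 67.18),
    ("PMQ", "DeepSeek-VL2-L, 2.57-bit", 70.60, 67.79),
    ("PMQ", "DeepSeek-VL2-S, 2.58-bit", 63.66, 61.79),
]


def margins(results):
    """Return (method, setting, margin), where margin = newer - Hessian.

    Margins are rounded to two decimals to match the precision of the inputs.
    """
    return [
        (method, setting, round(newer - hessian, 2))
        for method, setting, newer, hessian in results
    ]


if __name__ == "__main__":
    for method, setting, margin in margins(RESULTS):
        print(f"{method} [{setting}]: +{margin:.2f} over Hessian")
```

The rows come from different papers, models, and bit budgets (2.5 vs 2.54 bits, for example), so the margins are best read one row at a time rather than compared across rows.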

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.