Is SpinQuant superseded? Critiques, benchmarks & alternatives

Q: Is SpinQuant superseded?

SpinQuant (LLM quantization): superseded. It is now cited as a baseline and beaten by newer methods. 3 papers critique it and 8 beat it on benchmarks, which ranks it #7 of 80 most-superseded. Sub-problem: the cluster led by SmoothQuant. Newer alternatives in the same sub-problem include OffQ, FAIR-Calib, InfoQuant, ConQuR, and Activation Residual Hessian Quantization (ARHQ).

What papers say

Verbatim critique sentences, each from a paper that cites SpinQuant as a baseline.

“QuaRot, SpinQuant, and ButterflyQuant do not engage with directly [the regime of per-head q_norm/RoPE compatibility failures]”
— Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
“previous gradient-based optimization [SpinQuant] cannot easily explore permutation invariance, as permutation creates symmetric local optima in a non-convex fashion.”
— Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization
“Unlike other learnable methods (e.g., SpinQuant) that optimize over the full Stiefel manifold with high computational cost, our sparse parameterization guarantees orthogonality by construction, enabling stable and efficient optimization.”
— ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms
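
To make "orthogonality by construction" in the last quote concrete, here is a minimal pure-Python sketch. It is not the ButterflyQuant or SpinQuant implementation: Givens rotations are used only as a simple stand-in for a sparse parameterization, and the angles, rotation planes, activation row x, and weight matrix W are made-up illustration values. It checks two things: a product of plane rotations is orthogonal for any choice of angles, and an orthogonal R can be folded into a linear layer without changing its output, since (xR)(RᵀW) = xW.

```python
import math

def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]

def givens(n, i, j, theta):
    """n x n identity with a 2-D rotation by theta in the (i, j) plane."""
    g = identity(n)
    c, s = math.cos(theta), math.sin(theta)
    g[i][i], g[i][j] = c, -s
    g[j][i], g[j][j] = s, c
    return g

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical parameters: three free angles, one per rotation plane.
# Whatever values the angles take, the product R stays orthogonal.
angles = [0.3, -1.1, 2.0]
planes = [(0, 1), (2, 3), (1, 2)]
R = identity(4)
for (i, j), theta in zip(planes, angles):
    R = matmul(R, givens(4, i, j, theta))

# Orthogonality check: R^T R should equal the identity.
orth_err = max_abs_diff(matmul(transpose(R), R), identity(4))

# Invariance check: rotate activations by R and weights by R^T.
x = [[1.0, -2.0, 0.5, 8.0]]  # one activation row with an outlier channel
W = [[0.2, -0.1], [0.4, 0.3], [-0.5, 0.6], [0.1, 0.7]]
y_ref = matmul(x, W)
y_rot = matmul(matmul(x, R), matmul(transpose(R), W))
inv_err = max_abs_diff(y_ref, y_rot)

print(f"orthogonality error: {orth_err:.2e}")
print(f"output difference after rotation: {inv_err:.2e}")
```

Both errors should sit at floating-point noise level. The contrast drawn in the quote is that a dense rotation matrix has to be kept on the orthogonal manifold during optimization, whereas angle parameters like these can be updated freely.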

Beaten on benchmarks

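Head-to-head results where a newer method reports beating SpinQuant. Each pair lists the newer method's score first and SpinQuant's second; lower is better for perplexity, higher is better for GSM8K and the 0-shot averages. Values are copied from the source paper's tables, so verify them against the cited paper.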

HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A8)]
6.96 vs 7.48
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Qwen2.5-7B (W4A8)]
86.12 vs 84.97
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A16)]
6.43 vs 6.46
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Qwen2.5-7B (W4A16)]
87.50 vs 87.10
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W3A16)]
8.02 vs 8.85
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Llama-3-8B (W3A16)]
70.15 vs 66.10
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A4)]
10.29 vs 12.14
HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
InfoQuant beats SpinQuant · 0-shot Avg. [4-4-16]
65.74 vs 64.11
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · WikiText2 perplexity [4-4-16]
7.07 vs 7.28
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · 0-shot Avg. [4-4-4]
65.57 vs 64.10
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · WikiText2 perplexity [4-4-4]
7.16 vs 7.35
InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
SpecQuant beats SpinQuant · 0-shot^9 [4-16-16]
66.88 vs 66.54
SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
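
The raw pairs above are hard to compare across metrics, so the following minimal sketch computes the margin for each entry. The scores are copied from the listing; the lower_is_better flag is an assumption about metric direction (perplexity is lower-is-better, GSM8K and 0-shot averages are higher-is-better), and the tuple layout and function names are hypothetical.

```python
# Each row: (method, benchmark, setting, newer_score, spinquant_score, lower_is_better)
# Scores are copied from the listing above; verify them against the cited papers.
RESULTS = [
    ("HeRo-Q", "Wiki2 perplexity", "Llama-3-8B W4A8", 6.96, 7.48, True),
    ("HeRo-Q", "GSM8K", "Qwen2.5-7B W4A8", 86.12, 84.97, False),
    ("HeRo-Q", "Wiki2 perplexity", "Llama-3-8B W4A16", 6.43, 6.46, True),
    ("HeRo-Q", "GSM8K", "Qwen2.5-7B W4A16", 87.50, 87.10, False),
    ("HeRo-Q", "Wiki2 perplexity", "Llama-3-8B W3A16", 8.02, 8.85, True),
    ("HeRo-Q", "GSM8K", "Llama-3-8B W3A16", 70.15, 66.10, False),
    ("HeRo-Q", "Wiki2 perplexity", "Llama-3-8B W4A4", 10.29, 12.14, True),
    ("InfoQuant", "0-shot Avg.", "4-4-16", 65.74, 64.11, False),
    ("InfoQuant", "WikiText2 perplexity", "4-4-16", 7.07, 7.28, True),
    ("InfoQuant", "0-shot Avg.", "4-4-4", 65.57, 64.10, False),
    ("InfoQuant", "WikiText2 perplexity", "4-4-4", 7.16, 7.35, True),
    ("SpecQuant", "0-shot^9", "4-16-16", 66.88, 66.54, False),
]

def margin(newer, baseline, lower_is_better):
    """Signed gain of the newer method; positive means it beats the baseline."""
    return baseline - newer if lower_is_better else newer - baseline

def relative_margin(newer, baseline, lower_is_better):
    """Gain as a fraction of the baseline score."""
    return margin(newer, baseline, lower_is_better) / baseline

for method, bench, setting, newer, base, lower in RESULTS:
    gain = margin(newer, base, lower)
    rel = 100.0 * relative_margin(newer, base, lower)
    print(f"{method:10s} {bench:22s} {setting:18s} +{gain:.2f} ({rel:.1f}%)")
```

Printed this way, the gaps vary widely: some entries differ by only a few hundredths of a point, which is worth keeping in mind before treating every listed win as decisive.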

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.