SpinQuant
LLM quantization
superseded — cited as a baseline and beaten by newer methods
3 papers critique it · 8 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites SpinQuant as a baseline.
“QuaRot, SpinQuant, and ButterflyQuant do not engage with directly [the regime of per-head q_norm/RoPE compatibility failures]”
— Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
“previous gradient-based optimization [SpinQuant] cannot easily explore permutation invariance, as permutation creates symmetric local optima in a non-convex fashion.”
— Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization
“Unlike other learnable methods (e.g., SpinQuant) that optimize over the full Stiefel manifold with high computational cost, our sparse parameterization guarantees orthogonality by construction, enabling stable and efficient optimization.”
— ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms
Beaten on benchmarks
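Head-to-head results where a newer method reports beating SpinQuant. Each pair lists the newer method's value first and SpinQuant's second; lower is better for perplexity, higher is better for GSM8K and zero-shot averages. Values are copied from the source papers' tables; verify against the cited paper.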
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A8)]
6.96 vs 7.48
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Qwen2.5-7B (W4A8)]
86.12 vs 84.97
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A16)]
6.43 vs 6.46
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Qwen2.5-7B (W4A16)]
87.50 vs 87.10
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W3A16)]
8.02 vs 8.85
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · GSM8K [Llama-3-8B (W3A16)]
70.15 vs 66.10
- HeRo-Q: A General Framework for Stable Low Bit Quantization via Hessian Conditioning
HeRo-Q beats SpinQuant · Wiki2 (Perplexity) [Llama-3-8B (W4A4)]
10.29 vs 12.14
- InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · 0-shot Avg. [4-4-16]
65.74 vs 64.11
- InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · WikiText2 perplexity [4-4-16]
7.07 vs 7.28
- InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · 0-shot Avg. [4-4-4]
65.57 vs 64.10
- InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
InfoQuant beats SpinQuant · WikiText2 perplexity [4-4-4]
7.16 vs 7.35
- SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
SpecQuant beats SpinQuant · 0-shot^9 [4-16-16]
66.88 vs 66.54
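The pairs above mix lower-is-better metrics (perplexity) with higher-is-better ones (accuracy), so raw differences are not directly comparable. The following is a minimal sketch of how one might normalize them into a relative margin over SpinQuant. The helper name `relative_gain` and the choice of rows are illustrative assumptions, not part of any cited paper; the numbers are copied from the entries listed above.

```python
# Hypothetical helper for reading the head-to-head pairs; not from any cited paper.
def relative_gain(new: float, baseline: float, lower_is_better: bool) -> float:
    """Return the newer method's improvement over the baseline, in percent."""
    # For perplexity a smaller value wins; for accuracy a larger value wins.
    delta = (baseline - new) if lower_is_better else (new - baseline)
    return 100.0 * delta / baseline


# (label, newer method's value, SpinQuant's value, lower_is_better)
rows = [
    ("Wiki2 perplexity, Llama-3-8B (W4A4)", 10.29, 12.14, True),
    ("Wiki2 perplexity, Llama-3-8B (W4A16)", 6.43, 6.46, True),
    ("GSM8K, Llama-3-8B (W3A16)", 70.15, 66.10, False),
]

for label, new, base, lower in rows:
    gain = relative_gain(new, base, lower)
    print(f"{label}: {gain:.2f}% better than SpinQuant")
```

On these rows the W4A16 margin is far smaller than the W4A4 margin, which is worth keeping in mind when weighing how much a given "beats SpinQuant" entry actually matters.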
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 5, 2026
- FAIR-Calib · FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models · Jun 4, 2026
- May 25, 2026
- May 11, 2026
- Activation Residual Hessian Quantization (ARHQ) · Technical Report: Activation Residual Hessian Quantization (ARHQ) for Low-Bit LLM Quantization · Apr 30, 2026
- Apr 20, 2026
- Apr 14, 2026
- Mar 26, 2026
- Jan 29, 2026
- Reasoning-QAT · What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study · Jan 21, 2026
- Dec 3, 2025
- adaptive transformation selection framework · Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models · Nov 21, 2025