Superseded baseline · #26 of 80 most-superseded
AffineQuant
AffineQuant: Affine Transformation Quantization for Large Language Models
LLM quantization · first seen Mar 19, 2024
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 1 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites AffineQuant as a baseline.
“However, due to the significant overhead of full-size matrix multiplication, AffineQuant can only apply affine transformations to a small fraction of linear layers.”
— InfoQuant: Shaping Activation Distributions for Low-Bit LLM Quantization
“While affine transformations theoretically offer greater flexibility than rotation transformations for handling outliers, the original AffineQuant approach has practical limitations. It learns a full transformation matrix that can only be applied to output projection layers for weight-activation quantization, where it merges with preceding linear layers to avoid overhead. Other layers must use per-channel scaling, limiting the method's broader applicability across model architectures.”
— Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
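Both critiques turn on where a full transformation matrix can be absorbed without extra runtime cost. The following is a minimal NumPy sketch of that idea, assuming only the generic equivalent-transformation identity (XA)(A⁻¹W) = XW and that nothing nonlinear sits between the two layers. It is not AffineQuant's training procedure; the shapes, the random matrix `A`, and the helper names `equivalent_transform` and `merge_into_preceding` are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_prev, d_in, d_out, n_tokens = 6, 8, 4, 16


def equivalent_transform(X, W, A):
    """Return (X @ A, inv(A) @ W); their product equals X @ W."""
    return X @ A, np.linalg.inv(A) @ W


def merge_into_preceding(W_prev, A):
    """Fold A into the preceding linear layer so no extra matmul runs at inference."""
    return W_prev @ A


Z = rng.normal(size=(n_tokens, d_prev))   # input to the preceding layer
W_prev = rng.normal(size=(d_prev, d_in))  # preceding linear layer (hypothetical)
W = rng.normal(size=(d_in, d_out))        # layer whose inputs are transformed
X = Z @ W_prev                            # activations entering W

# Hypothetical full matrix, kept close to identity so it is well-conditioned here.
A = np.eye(d_in) + 0.1 * rng.normal(size=(d_in, d_in))

# Full transform: output is unchanged, but X @ A is an extra d_in x d_in matmul
# at runtime unless A can be folded into the layer that produces X.
X_t, W_t = equivalent_transform(X, W, A)
full_ok = np.allclose(X_t @ W_t, X @ W)

# Merged case: the preceding layer emits the transformed activations directly.
merged_ok = np.allclose(Z @ merge_into_preceding(W_prev, A), X_t)

# Per-channel scaling is the diagonal special case: an elementwise multiply
# on the activations instead of a full matrix product.
s = rng.uniform(0.5, 2.0, size=d_in)
scale_ok = np.allclose((X * s) @ (W / s[:, None]), X @ W)
```

When no preceding linear layer is available to absorb `A`, the full product `X @ A` would have to run at inference time, which is the overhead the first quote refers to; the diagonal case avoids it at the cost of less flexibility.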
Beaten on benchmarks
Head-to-head results where a newer method reports beating AffineQuant. Values are copied from the source paper's tables — verify against the cited paper.
- Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers
AESPA beats AffineQuant · Accuracy [INT2, zero-shot, LLaMA-13B Average]
46.91 vs 43.51
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 5, 2026
- FAIR-Calib · FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models · Jun 4, 2026
- May 25, 2026
- May 11, 2026
- Activation Residual Hessian Quantization (ARHQ) · Technical Report: Activation Residual Hessian Quantization (ARHQ) for Low-Bit LLM Quantization · Apr 30, 2026
- Apr 20, 2026
- Apr 14, 2026
- Mar 26, 2026
- Jan 29, 2026
- Reasoning-QAT · What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study · Jan 21, 2026
- Dec 3, 2025
- adaptive transformation selection framework · Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models · Nov 21, 2025