FlatQuant
FlatQuant: Flatness Matters for LLM Quantization
LLM quantization · first seen Oct 12, 2024
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 4 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites FlatQuant as a baseline.
“FlatQuant demonstrates a loss of only 1.4 points”
— OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension
“Even though these approaches successfully quantize LLMs to 4 bits with slight performance degradation, they apply the same type of transformation across all layers, ignoring the distribution characteristics of each layer within LLMs.”
— Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
Beaten on benchmarks
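Head-to-head results where a newer method reports beating FlatQuant. Each entry lists the newer method's value first and FlatQuant's value second; lower is better for perplexity (PPL) and higher is better for averages (Avg). Values are copied from the source papers' tables, so verify them against the cited paper.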
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats FlatQuant · Avg. [Qwen3-0.6B W4A4KV4]
21.44 vs 17.34
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats FlatQuant · Avg. [R1-1.5B W4A4KV4]
41.31 vs 38.39
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats FlatQuant · Avg. [Qwen3-4B W4A4KV4]
60.78 vs 58.28
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · WikiText-2 PPL [W4A4KV4]
5.61 vs 5.78
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · C4 PPL [W4A4KV4]
7.58 vs 7.86
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · WikiText-2 PPL [W3A3K3V3]
7.22 vs 7.54
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · C4 PPL [W3A3K3V3]
9.43 vs 9.76
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · WikiText-2 PPL [W4A4K2V2]
6.98 vs 7.51
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · C4 PPL [W4A4K2V2]
9.04 vs 9.87
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · WikiText-2 PPL [W3A3K2V2]
9.83 vs 11.51
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · C4 PPL [W3A3K2V2]
13.35 vs 15.89
- Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models
adaptive transformation selection framework beats FlatQuant · Avg [W4A4KV4]
67.88 vs 67.47
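Because the entries above mix lower-is-better and higher-is-better metrics, the raw pairs are hard to compare at a glance. The following is a minimal sketch, not taken from any of the cited papers, that converts a few of the listed pairs into a relative gain over FlatQuant. The helper names `relative_gain` and `summarize` and the `lower_is_better` flag are illustrative assumptions; the numbers are copied from the list above.

```python
# Each row: (metric, setting, newer method's value, FlatQuant's value, lower_is_better).
# Values are copied from the entries above. The lower_is_better flag is an assumption:
# True for perplexity (PPL), False for averaged accuracy scores.
RESULTS = [
    ("WikiText-2 PPL", "W4A4KV4", 5.61, 5.78, True),
    ("C4 PPL", "W4A4KV4", 7.58, 7.86, True),
    ("WikiText-2 PPL", "W3A3K2V2", 9.83, 11.51, True),
    ("C4 PPL", "W3A3K2V2", 13.35, 15.89, True),
    ("Avg", "W4A4KV4", 67.88, 67.47, False),
    ("Avg", "Qwen3-0.6B W4A4KV4", 21.44, 17.34, False),
]


def relative_gain(newer, baseline, lower_is_better):
    """Return the newer method's improvement over the baseline, in percent."""
    diff = (baseline - newer) if lower_is_better else (newer - baseline)
    return 100.0 * diff / baseline


def summarize(results):
    """Return (metric, setting, gain in percent rounded to 2 decimals) per row."""
    return [
        (metric, setting, round(relative_gain(newer, base, lower), 2))
        for metric, setting, newer, base, lower in results
    ]


if __name__ == "__main__":
    for metric, setting, gain in summarize(RESULTS):
        print(f"{metric:15s} {setting:20s} {gain:6.2f}%")
```

Under these assumptions, the WikiText-2 gap is roughly 3% at W4A4KV4 and roughly 15% at W3A3K2V2, which suggests the reported advantage over FlatQuant widens at lower bit-widths. Relative gains on perplexity and on averaged scores are not directly comparable with each other.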
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 5, 2026
- FAIR-Calib · FAIR-Calib: Frontier-Aware Instability-Reweighted Calibration for Post-Training Quantization of Diffusion Large Language Models · Jun 4, 2026
- May 25, 2026
- May 11, 2026
- Activation Residual Hessian Quantization (ARHQ) · Technical Report: Activation Residual Hessian Quantization (ARHQ) for Low-Bit LLM Quantization · Apr 30, 2026
- Apr 20, 2026
- Apr 14, 2026
- Mar 26, 2026
- Jan 29, 2026
- Reasoning-QAT · What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study · Jan 21, 2026
- Dec 3, 2025
- adaptive transformation selection framework · Adaptive Layer-Wise Transformations for Post-Training Quantization of Large Language Models · Nov 21, 2025