Is QuaRot superseded?

QuaRot (LLM quantization) is heavily superseded: a standard baseline that newer methods routinely beat. 7 papers critique it, and 12 head-to-head benchmark results report a newer method beating it. It ranks #4 of 80 most-superseded. Sub-problem: the cluster led by SmoothQuant. Newer alternatives in the same sub-problem include OffQ, FAIR-Calib, InfoQuant, ConQuR, and Activation Residual Hessian Quantization (ARHQ).

QuaRot

QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

LLM quantization · first seen Mar 30, 2024

What papers say

Verbatim critique sentences, each from a paper that cites QuaRot as a baseline.

“QuaRot reporting an accuracy loss of approximately 3.5 points”
— OSC: Hardware Efficient W4A4 Quantization via Outlier Separation in Channel Dimension
“Interestingly, we can prove analytically and show empirically that rotations improve MXFP4 accuracy, but hurt NVFP4 accuracy when coupled with standard Round-to-Nearest (RTN) quantization.”
— Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
“Yet, these rotations introduce quadratic complexity, which offsets the potential acceleration.”
— ConvRot: Rotation-Based Plug-and-Play 4-bit Quantization for Diffusion Transformers
“QuaRot, SpinQuant, and ButterflyQuant do not engage with directly [the regime of per-head q_norm/RoPE compatibility failures]”
— Influence-Inspired Spectral Rotations for Extreme Low-Bit LLM Quantization
“However, these methods operate primarily along the feature dimension and ignore correlations across the sequence dimension.”
— STaMP: Sequence Transformation and Mixed Precision for Low-Precision Activation Quantization
“QuaRot fails on Qwen models smaller than 14B, suggesting that naive rotation alone is insufficient to suppress quantization error in the presence of severe outliers in small models”
— OffQ: Taming Structured Outliers in LLM Quantization by Offsetting
“However, these predetermined rotations cannot adapt to specific models.”
— ButterflyQuant: Ultra-low-bit LLM Quantization through Learnable Orthogonal Butterfly Transforms
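
Several of the critiques above turn on what a rotation does before round-to-nearest (RTN) quantization. As a rough illustration only, and not QuaRot's actual pipeline, the sketch below applies a normalized Walsh-Hadamard transform to a made-up activation vector with one outlier channel, quantizes it with symmetric per-tensor 4-bit RTN, and compares the error against quantizing the unrotated vector. The input vector, the helper names (fwht, rtn_int4, mse), and the per-tensor scaling are all assumptions chosen for the example.

```python
import math


def fwht(values):
    """Normalized fast Walsh-Hadamard transform (its own inverse).

    The length must be a power of two.
    """
    out = list(values)
    n = len(out)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    norm = 1.0 / math.sqrt(n)
    return [norm * t for t in out]


def rtn_int4(values):
    """Symmetric per-tensor 4-bit round-to-nearest, returned dequantized."""
    scale = max(abs(v) for v in values) / 7.0
    return [scale * max(-8, min(7, round(v / scale))) for v in values]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Hypothetical toy input: 63 small entries plus one outlier channel.
x = [math.sin(k) for k in range(1, 64)] + [20.0]

# Quantize directly, then quantize in the rotated basis and rotate back.
plain_mse = mse(x, rtn_int4(x))
rotated_mse = mse(x, fwht(rtn_int4(fwht(x))))

print(f"plain RTN MSE:   {plain_mse:.4f}")
print(f"rotated RTN MSE: {rotated_mse:.4f}")
```

On this toy input the rotated path gives the lower error, because the outlier's energy is spread across all coordinates and no longer forces a coarse scale onto the small entries. It says nothing about the regimes the quoted papers report on, such as NVFP4 or small Qwen models.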

Beaten on benchmarks

Head-to-head results where a newer method reports beating QuaRot. Each score pair reads newer method vs QuaRot. Values are copied from the source papers' tables; verify them against the cited paper.

TesseraQ beats QuaRot · Avg. [W4A4]
65.12 vs 51.83
TesseraQ: Ultra Low-Bit LLM Post-Training Quantization with Block Reconstruction
QuantVSR beats QuaRot · PSNR [REDS4, W4A4]
23.31 vs 20.21
QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution
QuantVSR beats QuaRot · PSNR [SPMCS, W4A4]
22.76 vs 20.16
QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution
QuantVSR beats QuaRot · PSNR [MVSR4x, W4A4]
21.18 vs 21.00
QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution
STaR-Quant beats QuaRot · Avg. [W4A4 (4-bit weight and activation) on LLADA-8B]
57.07 vs 51.03
STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
STaR-Quant beats QuaRot · Avg. [W4A4 (4-bit weight and activation) on LLADA-1.5-8B]
66.93 vs 61.06
STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
STaR-Quant beats QuaRot · Avg. [W4A4 (4-bit weight and activation) on DREAM-7B]
63.59 vs 58.85
STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
SpecQuant beats QuaRot · 0-shot^9 [4-4-16]
64.75 vs 61.69
SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
SpecQuant beats QuaRot · 0-shot^9 [4-4-4]
64.75 vs 61.38
SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
MicroRotated-GPTQ beats QuaRot · Avg [MXFP4 W4A4]
73.65 vs 62.90
Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
Reasoning-QAT beats QuaRot · Avg. [Qwen3-0.6B W4A4KV4]
21.44 vs 4.84
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats QuaRot · Avg. [R1-1.5B W4A4KV4]
41.31 vs 2.11
What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.