Is ZeroQuant superseded?

ZeroQuant (LLM quantization) is classed as superseded: it is cited as a baseline and beaten by newer methods. 3 papers critique it and 0 beat it on benchmarks, which ranks it #30 of the 80 most-superseded methods. Sub-problem: cluster led by FlexRound.

Superseded baseline · #30 of 80 most-superseded

ZeroQuant

ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers

LLM quantization · first seen Jun 4, 2022

What papers say

Verbatim critique sentences, each from a paper that cites ZeroQuant as a baseline.

“ZeroQuant incurs severe accuracy degradation for an open-source LLM”
— LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices
“ZeroQuant requires 3.1 hours on a single A100 GPU to quantize an LLM with 1.3 billion parameters.”
— SplitQuantV2: Enhancing Low-Bit Quantization of LLMs Without GPUs
“However, both LLM.int8() and ZeroQuant are not efficient for quantizing LLMs to extreme low-percision number formats such as 3-bit integers.”
— AdpQ: A Zero-shot Calibration Free Adaptive Post Training Quantization Method for LLMs