Method Drift

Living systematic review

LLM quantization

Compressing LLM weights and activations to low bit-widths (4-bit and below) for cheaper inference while preserving quality. Covers outlier handling, rotation, and post-training quantization.
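As a minimal sketch of what low-bit compression means in practice, the snippet below applies symmetric round-to-nearest (RTN) quantization to a short list of weights and shows how a single outlier inflates the error on every other weight. The function names, the per-tensor scaling choice, and the sample values are illustrative assumptions, not taken from any paper in this review.

```python
def quantize_rtn(weights, bits=4):
    """Symmetric per-tensor round-to-nearest quantization (illustrative sketch)."""
    qmax = 2 ** (bits - 1) - 1        # largest signed code, e.g. 7 for 4-bit
    qmin = -(2 ** (bits - 1))         # smallest signed code, e.g. -8 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round each weight to the nearest integer code and clamp to the valid range.
    codes = [min(qmax, max(qmin, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real-valued weights."""
    return [c * scale for c in codes]


def max_error(weights, bits=4):
    """Worst-case absolute reconstruction error over the given weights."""
    codes, scale = quantize_rtn(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Made-up sample weights, for illustration only.
clean = [0.12, -0.53, 0.08, 0.91, -0.27, 0.04]
with_outlier = clean + [8.0]          # one large outlier stretches the scale

codes, scale = quantize_rtn(clean, bits=4)
err_clean = max_error(clean, bits=4)
err_outlier = max_error(with_outlier, bits=4)
print("codes:", codes)
print("max error without outlier:", err_clean)
print("max error with outlier:   ", err_outlier)
```

Because the scale is set by the largest magnitude, one outlier coarsens the grid for all remaining values, which is the problem that outlier-handling and rotation methods aim to reduce.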

97 papers · 151 critique receipts · 1,451 benchmark results · updated Jun 18, 2026

Most-superseded baselines

Ranked by how many distinct papers critique or beat each method. These are the standard baselines that newer work routinely measures against. Each entry names the method, followed by the sub-problem cluster it belongs to.

  1.

  2. AWQ · GPTQ
     AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

     9 papers critique it · 15 beat it on benchmarks

  4. QuaRot · SmoothQuant
     QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs

     7 papers critique it · 12 beat it on benchmarks

  6. RTN · RTN

     2 papers critique it · 10 beat it on benchmarks

  7. SpinQuant · SmoothQuant

     3 papers critique it · 8 beat it on benchmarks

  8. AQLM · GPTQ

     3 papers critique it · 5 beat it on benchmarks

  9. QuIP · GPTQ
     QuIP: 2-Bit Quantization of Large Language Models With Guarantees

     3 papers critique it · 4 beat it on benchmarks

  10. SVDQuant · SmoothQuant
      SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models

      3 papers critique it · 4 beat it on benchmarks

  11. PB-LLM · GPTQ
      PB-LLM: Partially Binarized Large Language Models

      4 papers critique it · 3 beat it on benchmarks

  12. FlatQuant · SmoothQuant
      FlatQuant: Flatness Matters for LLM Quantization

      2 papers critique it · 4 beat it on benchmarks
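The ordering of the list above can be reproduced with a short sketch, assuming the score is simply the number of critiquing papers plus the number of papers that beat the method. That formula is an assumption: the page says "distinct papers", so the real score may deduplicate papers that both critique and beat a method. The counts are copied from the entries above; the name `supersession_score` is hypothetical.

```python
# (method, papers critiquing it, papers beating it), copied from the list above.
baselines = [
    ("AWQ", 9, 15),
    ("QuaRot", 7, 12),
    ("RTN", 2, 10),
    ("SpinQuant", 3, 8),
    ("AQLM", 3, 5),
    ("QuIP", 3, 4),
    ("SVDQuant", 3, 4),
    ("PB-LLM", 4, 3),
    ("FlatQuant", 2, 4),
]


def supersession_score(entry):
    """Assumed score: critiques plus benchmark losses (not confirmed by the source)."""
    _, critiques, beaten_by = entry
    return critiques + beaten_by


# sorted() is stable, so tied methods keep the order in which they were listed.
ranked = sorted(baselines, key=supersession_score, reverse=True)

for name, critiques, beaten_by in ranked:
    print(f"{name:10s} critiques={critiques:2d} beaten_by={beaten_by:2d} "
          f"score={critiques + beaten_by:2d}")
```

Under this assumed score the sorted order matches the displayed order, with QuIP, SVDQuant, and PB-LLM tied.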

Sub-problems

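Methods that compete on the same benchmarks cluster into distinct sub-problems. Each cluster is labeled with its first-listed method and its total method count; at most six members are shown.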

GPTQ · 26 methods

GPTQ · AWQ · OmniQuant · AQLM · QuIP · PB-LLM

SmoothQuant · 33 methods

SmoothQuant · QuaRot · SpinQuant · SVDQuant · FlatQuant · AffineQuant

RTN · 30 methods

RTN · BitNet · QLoRA · EfficientQAT · QTIP · ARB-LLM

GPTAQ · 12 methods

GPTAQ · MBQ · DuQuant · MASQuant · QSLAW · QSVD

PACT · 10 methods

PACT · LSQ · N2UQ · GPLQ · LSQ+ · DiffQ

QDrop · 8 methods

QDrop · PD-Quant · FIMA-Q · AdaLog · EasyQuant · MGRQ

FlexRound · 6 methods

FlexRound · LLM.int8() · ZeroQuant · SplitQuantV2 · LRQ · AdpQ

AMQ · 8 methods

AMQ · SFMP · HAQ · HAWQ · MixLLM · BitStack

QServe · 3 methods

QServe · APEX4 · LiquidGEMM
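One simple way to form clusters like those above is to link any two methods that report on a shared benchmark and take the connected components. The sketch below does this with a small union-find. The review does not state its actual clustering procedure, so this is an assumption, and the benchmark identifiers (`bench_A` and so on) and their assignment to methods are placeholders invented for illustration.

```python
from collections import defaultdict

# Hypothetical input: method names appear in this review, but the benchmark
# ids and which method reports on which benchmark are made-up placeholders.
reports = {
    "GPTQ": {"bench_A", "bench_B"},
    "AWQ": {"bench_B"},
    "SmoothQuant": {"bench_C"},
    "QuaRot": {"bench_C", "bench_D"},
    "SpinQuant": {"bench_D"},
    "QServe": {"bench_E"},
}


def cluster_by_shared_benchmarks(reports):
    """Group methods into connected components linked by shared benchmarks."""
    parent = {method: method for method in reports}

    def find(m):
        # Walk to the component root, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # benchmark -> first method seen reporting on it
    for method, benchmarks in reports.items():
        for bench in benchmarks:
            if bench in first_seen:
                union(first_seen[bench], method)
            else:
                first_seen[bench] = method

    clusters = defaultdict(list)
    for method in reports:
        clusters[find(method)].append(method)
    # Sort members and clusters so the output is deterministic.
    return sorted(sorted(members) for members in clusters.values())


clusters = cluster_by_shared_benchmarks(reports)
for members in clusters:
    print(" · ".join(members))
```

Connected components are the coarsest possible grouping; a production system might instead weight links by how many benchmarks two methods share.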

The frontier

Recent methods not yet superseded in the knowledge base.