RTN (round-to-nearest)
LLM quantization
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 10 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites RTN as a baseline.
“the quantization error increases significantly when the number of bits is small, especially when significant outliers exist.”
— CCQ: Convolutional Code for Extreme Low-bit Quantization in LLMs
“uniform grids spend disproportionate capacity on rare large magnitudes while under-resolving dense near-zero regions; the mismatch is exacerbated in layers whose weight magnitudes span multiple decades.”
— Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models
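Both critiques can be illustrated with a minimal sketch of RTN. It assumes symmetric per-tensor quantization onto a uniform signed grid, which is one common formulation and not necessarily the exact baseline used in the cited papers. The synthetic Gaussian weights, the single outlier of 1.0, and the helper names `rtn_quantize` and `mse` are illustrative assumptions.

```python
import random

def rtn_quantize(weights, bits):
    """Symmetric per-tensor round-to-nearest onto a uniform grid (illustrative)."""
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for a 4-bit signed grid
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    dequantized = []
    for w in weights:
        q = round(w / scale)                    # nearest grid index
        q = max(-qmax, min(qmax, q))            # clamp to the representable range
        dequantized.append(q * scale)           # map back to a real value
    return dequantized

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical data: small Gaussian weights, then the same weights with one outlier.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4096)]
with_outlier = weights[:-1] + [1.0]

errors = {}
for bits in (8, 4, 3):
    clean_err = mse(weights, rtn_quantize(weights, bits))
    outlier_err = mse(with_outlier, rtn_quantize(with_outlier, bits))
    errors[bits] = (clean_err, outlier_err)
    print(f"{bits}-bit  clean MSE={clean_err:.3e}  with outlier MSE={outlier_err:.3e}")
```

In this sketch the outlier sets the scale of the whole grid, so most small weights collapse onto very few levels. This is the behavior the two quotes describe: error grows as the bit width shrinks, and a single large magnitude makes it worse.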
Beaten on benchmarks
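Head-to-head results where a newer method reports beating RTN. Each pair is listed as newer method vs RTN. For perplexity (PPL), lower is better; for averaged benchmark scores (Avg.), higher is better. Values are copied from the source paper's tables; verify against the cited paper.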
- STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
STaR-Quant beats RTN · Avg. [W4A4 (4-bit weight and activation) on LLADA-8B]
57.07 vs 44.23
- STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
STaR-Quant beats RTN · Avg. [W4A4 (4-bit weight and activation) on LLADA-1.5-8B]
66.93 vs 53.12
- STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models
STaR-Quant beats RTN · Avg. [W4A4 (4-bit weight and activation) on DREAM-7B]
63.59 vs 49.53
- Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
ApiQ beats RTN · WikiText2 PPL [LLaMA-2-7B, 3-bit]
5.77 vs 6.66
- Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
ApiQ beats RTN · C4 PPL [LLaMA-2-7B, 3-bit]
7.48 vs 8.40
- Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
ApiQ beats RTN · WikiText2 PPL
5.12 vs 5.51
- Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
ApiQ beats RTN · C4 PPL
6.83 vs 7.18
- Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models
Benford-Quant beats RTN · Perplexity [3 bits, Small models]
70.87 vs 755.19
- Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models
Benford-Quant beats RTN · Perplexity [4 bits, Small models]
32.28 vs 38.91
- Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models
Benford-Quant beats RTN · Perplexity [4 bits, Large models]
7.02 vs 17.22
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats RTN · Avg. [Qwen3-0.6B W3G128]
31.67 vs 5.07
- What Makes Low-Bit Quantization-Aware Training Work for Reasoning LLMs? A Systematic Study
Reasoning-QAT beats RTN · Avg. [R1-Qwen-1.5B W3G128]
43.26 vs 10.08
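Settings such as W3G128 are commonly read as 3-bit weights with a quantization group size of 128; the source does not define the notation, so that reading is an assumption here. Under that assumption, a group-wise RTN baseline can be sketched as below. The symmetric grid, the synthetic weights, and the helper names `rtn_quantize`, `rtn_quantize_grouped`, and `mse` are illustrative and not taken from any cited paper.

```python
import random

def rtn_quantize(weights, bits):
    """Symmetric round-to-nearest with one scale for the whole input (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def rtn_quantize_grouped(weights, bits, group_size):
    """Apply RTN independently to consecutive groups, each with its own scale."""
    out = []
    for start in range(0, len(weights), group_size):
        out.extend(rtn_quantize(weights[start:start + group_size], bits))
    return out

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical data: small Gaussian weights plus a single large outlier.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(4095)] + [1.0]

per_tensor_err = mse(weights, rtn_quantize(weights, bits=3))
grouped_err = mse(weights, rtn_quantize_grouped(weights, bits=3, group_size=128))
print(f"per-tensor MSE={per_tensor_err:.3e}  group-128 MSE={grouped_err:.3e}")
```

Per-group scales confine the outlier's effect to its own group of 128 weights, which is why group-wise RTN is a stronger baseline than per-tensor RTN. The results listed above still report it losing to the newer methods at 3 bits.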
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- STaR-Quant · STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models · Jun 3, 2026
- May 26, 2026
- May 1, 2026
- Bit-by-Bit · Bit-by-Bit: Progressive QAT Strategy with Outlier Channel Splitting for Stable Low-Bit LLMs · Apr 9, 2026
- Benford-Quant · Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models · Jan 29, 2026
- Hestia · HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs · Jan 28, 2026
- Layer-Wise High-Impact Parameter Ratio Optimization · Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models · Nov 21, 2025
- Sep 28, 2025