QTIP
QTIP: Quantization with Trellises and Incoherence Processing
LLM quantization · first seen Jun 17, 2024
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 2 papers beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites QTIP as a baseline.
“Although QTIP shows significant improvement over TCQ, it still suffers from high computational complexity.”
— CCQ: Convolutional Code for Extreme Low-bit Quantization in LLMs
Beaten on benchmarks
Head-to-head results where a newer method reports beating QTIP. Values are copied from the source paper's tables — verify against the cited paper.
- Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
GLVQ-8D beats QTIP · ARC-Challenge [2-bit quantization, Llama 2-13B]
40.0 vs 39.2
- Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
GLVQ-32D beats QTIP · Perplexity [2-bit, Llama 2-7B]
5.41 vs 5.91
- Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression
GLVQ-32D beats QTIP · Perplexity [2-bit, Llama 2-70B]
3.36 vs 3.78
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-5% beats QTIP · C4 [Llama2-7B, ctx. 4096, 4.3 bits]
6.70 vs 6.71
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-5% beats QTIP · Wiki2 [Llama2-13B, ctx. 4096, 4.3 bits]
4.61 vs 4.62
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-5% beats QTIP · C4 [Llama2-13B, ctx. 4096, 4.3 bits]
6.09 vs 6.10
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-5% beats QTIP · C4 [Llama2-13B, ctx. 4096, 3.3 bits]
6.26 vs 6.28
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-8.25% beats QTIP · Wiki2 [Llama2-7B, ctx. 4096, 2.4 bits]
6.35 vs 6.82
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-8.25% beats QTIP · C4 [Llama2-7B, ctx. 4096, 2.4 bits]
8.25 vs 8.96
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-8.25% beats QTIP · C4 [Llama2-13B, ctx. 4096, 2.4 bits]
7.25 vs 7.39
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-8.25% beats QTIP · Wiki2 [Llama2-70B, ctx. 4096, 2.4 bits]
3.86 vs 3.87
- ICQuant: Index Coding enables Low-bit LLM Quantization
ICQuant^SK-8.25% beats QTIP · C4 [Llama2-70B, ctx. 4096, 2.4 bits]
5.61 vs 5.70
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- STaR-Quant · STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models · Jun 3, 2026
- May 26, 2026
- May 1, 2026
- Bit-by-Bit · Bit-by-Bit: Progressive QAT Strategy with Outlier Channel Splitting for Stable Low-Bit LLMs · Apr 9, 2026
- Benford-Quant · Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models · Jan 29, 2026
- Hestia · HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs · Jan 28, 2026
- Layer-Wise High-Impact Parameter Ratio Optimization · Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models · Nov 21, 2025
- Sep 28, 2025