Superseded baseline · #20 of 80 most-superseded
QLoRA
QLoRA: Efficient Finetuning of Quantized LLMs
LLM quantization · first seen May 23, 2023
superseded — cited as a baseline and beaten by newer methods
3 papers critique it · 1 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites QLoRA as a baseline.
“Limited to fine-tuning (not training from scratch) and requires GPU hardware.”
— True 4-Bit Quantized Convolutional Neural Network Training on CPU: Achieving Full-Precision Parity
“However, these methods only apply quantization to the weight during fine-tuning to reduce memory consumption.”
— RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
“Although QLoRA substantially reduces GPU memory and fine-tuning time while maintaining strong performance at the 4-bit level, it suffers from considerable performance degradation below 4 bits.”
— Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
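The second critique turns on the difference between weight-only and weight-activation quantization. As a rough illustration only, the sketch below applies symmetric round-to-nearest 4-bit quantization to a weight vector while leaving the activations in full precision. It is not QLoRA's NF4 data type and not RoLoRA's method; the function names, the per-tensor scale, and the sample values are assumptions made for this example.

```python
def quantize_rtn_4bit(values):
    """Symmetric per-tensor round-to-nearest quantization to signed 4-bit codes."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to code 7; fall back to 1.0 for an all-zero input.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest integer, then clamp to the signed 4-bit range [-8, 7].
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def weight_only_dot(weights, activations):
    """Dot product with 4-bit weights and unquantized activations."""
    codes, scale = quantize_rtn_4bit(weights)
    approx_weights = dequantize(codes, scale)
    return sum(w * a for w, a in zip(approx_weights, activations))


# Hypothetical sample values, chosen only to exercise the functions.
sample_weights = [0.7, -0.3, 0.12, 0.0]
sample_activations = [1.0, 2.0, -1.0, 0.5]
sample_codes, sample_scale = quantize_rtn_4bit(sample_weights)
print(sample_codes, sample_scale)
print(weight_only_dot(sample_weights, sample_activations))
```

Quantizing the activations as well would mean passing `activations` through the same kind of rounding step before the dot product, which is the harder case the critique points at.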
Beaten on benchmarks
Head-to-head results where a newer method reports beating QLoRA. Values are copied from the source paper's tables — verify against the cited paper.
- RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
RoLoRA beats QLoRA · Avg. (ZCSR) [W4A16 + RTN]
45.3 vs 36.2
- RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
RoLoRA beats QLoRA · Avg. (MMLU) [W4A16 + RTN]
24.5 vs 23.5
- RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
RoLoRA beats QLoRA · Avg. (ZCSR) [W4A4 + GPTQ]
58.2 vs 37.7
- RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization
RoLoRA beats QLoRA · Avg. (MMLU) [W4A4 + GPTQ]
29.9 vs 23.6
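To make the size of each gap easier to compare, the short sketch below computes the absolute margin for the four results listed above. The numbers are copied from this page; the tuple layout and the `margins` helper are illustrative assumptions, and the figures should still be checked against the RoLoRA paper.

```python
# (metric, setting, RoLoRA score, QLoRA score), as listed on this page.
results = [
    ("Avg. (ZCSR)", "W4A16 + RTN", 45.3, 36.2),
    ("Avg. (MMLU)", "W4A16 + RTN", 24.5, 23.5),
    ("Avg. (ZCSR)", "W4A4 + GPTQ", 58.2, 37.7),
    ("Avg. (MMLU)", "W4A4 + GPTQ", 29.9, 23.6),
]


def margins(rows):
    """Return (metric, setting, margin) with the margin rounded to one decimal."""
    return [
        (metric, setting, round(rolora - qlora, 1))
        for metric, setting, rolora, qlora in rows
    ]


for metric, setting, margin in margins(results):
    print(f"{metric} [{setting}]: RoLoRA leads by {margin} points")
```

On these figures, the gap is wider in the W4A4 + GPTQ setting than in the W4A16 + RTN setting for both metrics.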
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- STaR-Quant · STaR-Quant: State-Time Consistent Post-Training Quantization for Diffusion Large Language Models · Jun 3, 2026
- May 26, 2026
- May 1, 2026
- Bit-by-Bit · Bit-by-Bit: Progressive QAT Strategy with Outlier Channel Splitting for Stable Low-Bit LLMs · Apr 9, 2026
- Benford-Quant · Benford's Law as a Distributional Prior for Post-Training Quantization of Large Language Models · Jan 29, 2026
- Hestia · HESTIA: A Hessian-Guided Differentiable Quantization-Aware Training Framework for Extremely Low-Bit LLMs · Jan 28, 2026
- Layer-Wise High-Impact Parameter Ratio Optimization · Layer-Wise High-Impact Parameter Ratio Optimization in Post-Training Quantization for Large Language Models · Nov 21, 2025
- Sep 28, 2025