Is PB-LLM superseded?

PB-LLM (LLM quantization) is superseded: newer methods cite it as a baseline and report beating it. Four papers critique it and three report beating it on benchmarks, which ranks it #11 of 80 most-superseded. It belongs to the sub-problem cluster led by GPTQ. Newer alternatives in the same sub-problem include QVGGT, LFQ, ADMM-Q, OSAQ, and SEPTQ.

Superseded baseline · #11 of 80 most-superseded

PB-LLM

PB-LLM: Partially Binarized Large Language Models

LLM quantization · first seen Sep 29, 2023

superseded — cited as a baseline and beaten by newer methods

4 papers critique it · 3 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites PB-LLM as a baseline.

“However, it reveals the challenge of retaining a significant portion of weights, typically over 30%, at INT8 precision to maintain acceptable performance.”
— Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models
“both methods introduce additional unstructured fine-grained masks to distinguish salient weights which requires additional 1-bit per weight to store the mask and leads the memory of the quantized model to exceeding 2-bit per weight, where PB-LLM with 2.7-bit and BiLLM with 2.1-bit respectively”
— PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
“Even the recent binary PTQ method for LLMs, PB-LLM [shang2023pb], only maintains a perplexity metric of around 800 with an average weight of 1.7 bits.”
— BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
“PB-LLM [shang2023pb] applies fixed saliency thresholds across layers”
— Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models
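The 1.7-bit and 2.7-bit figures quoted above can be reconciled with simple arithmetic. The sketch below is illustrative only: it assumes (this page does not state it) that 10% of weights are kept as salient at 8 bits, that the remaining weights are binarized to 1 bit, and that the mask costs 1 extra bit per weight, as the PTQ1.61 quote describes. The function name `average_bits` and its parameters are hypothetical.

```python
def average_bits(salient_fraction: float,
                 salient_bits: int = 8,
                 binary_bits: int = 1,
                 mask_bits: int = 0) -> float:
    """Average storage cost per weight for a partially binarized layer.

    salient_fraction: share of weights kept at higher precision (assumed value).
    salient_bits:     bits used for each salient weight (assumed INT8).
    binary_bits:      bits used for each remaining weight.
    mask_bits:        extra bits per weight for the mask marking salient weights.
    """
    if not 0.0 <= salient_fraction <= 1.0:
        raise ValueError("salient_fraction must be between 0 and 1")
    payload = (salient_fraction * salient_bits
               + (1.0 - salient_fraction) * binary_bits)
    return payload + mask_bits


# Assumed configuration: 10% salient weights at 8 bits, the rest at 1 bit.
without_mask = average_bits(0.10)
with_mask = average_bits(0.10, mask_bits=1)
print(f"{without_mask:.1f} bits without mask, {with_mask:.1f} bits with mask")
```

Under the same assumptions, `average_bits(0.30)` gives about 3.1 bits per weight before any mask, which illustrates why retaining over 30% of weights at INT8, as the Bi-VLM quote notes, moves the method well away from the 1-bit regime.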

Beaten on benchmarks

Head-to-head results where a newer method reports beating PB-LLM. Values are copied from the source papers' tables, so verify them against the cited paper. Higher is better for Avg. rows; lower is better for perplexity (PPL) rows.

PTQ1.61 beats PB-LLM
Source: PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models

Metric [setting]              PTQ1.61   PB-LLM
Avg. [LLaMA 1-8]              41.14     28.59
Avg. [LLaMA 1-13]             46.56     34.72
Avg. [LLaMA 1-30]             51.77     40.93
Avg. [LLaMA 2-7]              39.20     27.17
Avg. [LLaMA 2-13]             44.72     26.29
Avg. [LLaMA 3-8]              37.20     30.14
PPL [WikiText2 LLaMA 1-7]     12.50     102.19
PPL [WikiText2 LLaMA 1-13]    9.67      48.11
PPL [WikiText2 LLaMA 1-30]    7.95      26.37
PPL [C4 LLaMA 1-7]            17.13     67.92
PPL [C4 LLaMA 1-13]           13.51     34.20

BiLLM beats PB-LLM
Source: BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Metric [setting]                   BiLLM     PB-LLM
perplexity [LLaMA-7B mixed-bit]    35.04     102.36
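To put the perplexity gaps above on a common scale, each pair can be expressed as a reduction factor (PB-LLM perplexity divided by the newer method's perplexity). The sketch below is a convenience calculation, not something the cited papers report; the values are the PTQ1.61 perplexity rows copied from this page, and the helper name `reduction_factor` is hypothetical.

```python
# (setting, PTQ1.61 perplexity, PB-LLM perplexity), copied from the rows above.
ppl_rows = [
    ("WikiText2 LLaMA 1-7", 12.50, 102.19),
    ("WikiText2 LLaMA 1-13", 9.67, 48.11),
    ("WikiText2 LLaMA 1-30", 7.95, 26.37),
    ("C4 LLaMA 1-7", 17.13, 67.92),
    ("C4 LLaMA 1-13", 13.51, 34.20),
]


def reduction_factor(newer_ppl: float, baseline_ppl: float) -> float:
    """Return how many times lower the newer method's perplexity is."""
    if newer_ppl <= 0 or baseline_ppl <= 0:
        raise ValueError("perplexities must be positive")
    return baseline_ppl / newer_ppl


factors = {name: reduction_factor(newer, base) for name, newer, base in ppl_rows}
for name, factor in factors.items():
    print(f"{name}: {factor:.1f}x lower perplexity than PB-LLM")
```

On these rows the factor ranges from roughly 2.5x (C4, LLaMA 1-13) to roughly 8x (WikiText2, LLaMA 1-7), so the reported gap narrows as model size grows.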

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base, include QVGGT, LFQ, ADMM-Q, OSAQ, and SEPTQ.