PB-LLM
PB-LLM: Partially Binarized Large Language Models
LLM quantization · first seen Sep 29, 2023
superseded — cited as a baseline and beaten by newer methods
4 papers critique it · 3 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites PB-LLM as a baseline.
“However, it reveals the challenge of retaining a significant portion of weights, typically over 30%, at INT8 precision to maintain acceptable performance.”
— Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models
“both methods introduce additional unstructured fine-grained masks to distinguish salient weights which requires additional 1-bit per weight to store the mask and leads the memory of the quantized model to exceeding 2-bit per weight, where PB-LLM with 2.7-bit and BiLLM with 2.1-bit respectively”
— PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
“Even the recent binary PTQ method for LLMs, PB-LLM~shang2023pb, only maintains a perplexity metric of around 800 with an average weight of 1.7 bits.”
— BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
“PB-LLM~shang2023pb applies fixed saliency thresholds across layers”
— Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models
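The bit-width figures quoted above (1.7 bits, 2.7 bits, over 30% of weights at INT8) can be related by simple storage arithmetic. The following is a minimal sketch, not PB-LLM's actual accounting: it assumes a two-level scheme in which a fraction of salient weights is stored at 8 bits, the remainder at 1 bit, and an optional unstructured mask costs one extra bit per weight. The function name `average_bits` and the 10% salient fraction are illustrative assumptions, chosen because they happen to reproduce the quoted 1.7-bit and 2.7-bit figures.

```python
def average_bits(salient_fraction, salient_bits=8, binary_bits=1, mask_bits=0):
    """Average storage cost per weight for a partially binarized layer.

    salient_fraction: share of weights kept at higher precision (assumed).
    mask_bits: extra bits per weight used to mark which weights are salient.
    """
    if not 0.0 <= salient_fraction <= 1.0:
        raise ValueError("salient_fraction must be in [0, 1]")
    weight_cost = (salient_fraction * salient_bits
                   + (1.0 - salient_fraction) * binary_bits)
    return weight_cost + mask_bits


# Assumed configuration: 10% of weights kept at 8 bits.
print(round(average_bits(0.10), 2))               # weights only
print(round(average_bits(0.10, mask_bits=1), 2))  # plus a 1-bit-per-weight mask
print(round(average_bits(0.30), 2))               # 30% of weights at INT8
```

Under these assumptions the three calls give 1.7, 2.7, and 3.1 bits per weight. This illustrates why the critiques treat both the salient-weight share and the mask as the main obstacles to staying below 2 bits.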
Beaten on benchmarks
Head-to-head results where a newer method reports beating PB-LLM. Values are copied from the source paper's tables — verify against the cited paper.
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 1-8]
41.14 vs 28.59
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 1-13]
46.56 vs 34.72
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 1-30]
51.77 vs 40.93
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 2-7]
39.20 vs 27.17
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 2-13]
44.72 vs 26.29
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · Avg. [LLaMA 3-8]
37.20 vs 30.14
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · PPL [WikiText2 LLaMA 1-7]
12.50 vs 102.19
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · PPL [WikiText2 LLaMA 1-13]
9.67 vs 48.11
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · PPL [WikiText2 LLaMA 1-30]
7.95 vs 26.37
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · PPL [C4 LLaMA 1-7]
17.13 vs 67.92
- PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models
PTQ1.61 beats PB-LLM · PPL [C4 LLaMA 1-13]
13.51 vs 34.20
- BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
BiLLM beats PB-LLM · perplexity [LLaMA-7B mixed-bit]
35.04 vs 102.36
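The two kinds of rows above point in opposite directions, which is easy to misread. A small sketch for comparing them, assuming that "Avg." is a score where higher is better and that perplexity is lower-is-better; the helper names are hypothetical and the numbers are a few rows copied from the list above.

```python
# (setting, newer-method value, PB-LLM value), copied from the list above.
AVG_ROWS = [
    ("LLaMA 2-7", 39.20, 27.17),
    ("LLaMA 2-13", 44.72, 26.29),
    ("LLaMA 3-8", 37.20, 30.14),
]
PPL_ROWS = [
    ("WikiText2 LLaMA 1-7", 12.50, 102.19),
    ("WikiText2 LLaMA 1-13", 9.67, 48.11),
    ("C4 LLaMA 1-7", 17.13, 67.92),
]


def score_gain(new, old):
    """Absolute gain in points for a higher-is-better score (assumed)."""
    return new - old


def perplexity_ratio(new, old):
    """How many times lower the newer method's perplexity is."""
    return old / new


for setting, new, old in AVG_ROWS:
    print(f"Avg. [{setting}]: +{score_gain(new, old):.2f} points")
for setting, new, old in PPL_ROWS:
    print(f"PPL [{setting}]: {perplexity_ratio(new, old):.2f}x lower")
```

Reporting perplexity as a ratio rather than a difference keeps rows from different model sizes comparable, since absolute perplexity shrinks as models grow.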
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 29, 2026
- LFQ · LFQ: Logit-aware Final-block Quantization for Boosting the Generation Quality of Low-Bit Quantized LLMs · May 28, 2026
- ADMM-Q · ADMM-Q: An Improved Hessian-based Weight Quantizer for Post-Training Quantization of Large Language Models · May 11, 2026
- May 6, 2026
- Apr 11, 2026
- Jan 21, 2026
- Grouped Lattice Vector Quantization (GLVQ) · Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression · Oct 23, 2025
- Sep 28, 2025
- Bi-VLM · Bi-VLM: Pushing Ultra-Low Precision Post-Training Quantization Boundaries in Vision-Language Models · Sep 23, 2025
- Sep 18, 2025