Is FlexRound superseded?

FlexRound (LLM quantization) is superseded: newer methods cite it as a baseline and report beating it. One paper critiques it and two papers report beating it on benchmarks. It ranks #23 of 80 most-superseded methods. Sub-problem: the cluster led by FlexRound.

Superseded baseline · #23 of 80 most-superseded

FlexRound

FlexRound: Learnable Rounding based on Element-wise Division for Post-Training Quantization

LLM quantization · first seen Jun 1, 2023

superseded — cited as a baseline and beaten by newer methods

1 paper critiques it · 2 papers beat it on benchmarks
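
As a rough illustration of what "learnable rounding based on element-wise division" in the title could look like, here is a minimal Python sketch. It assumes a formulation in which each weight is divided by a shared grid size times its own positive divisor before being rounded to the nearest integer. The function names `quantize_codes` and `dequantize`, the parameter names, and the hand-picked divisor values are all hypothetical; in the actual method the divisors would be learned rather than set by hand, so check the FlexRound paper for the exact formulation.

```python
def quantize_codes(weights, grid_size, divisors, bits=4):
    """Map each weight to a signed integer code.

    Assumed formulation: code = clip(round(w / (grid_size * divisor))).
    With every divisor equal to 1.0 this is plain round-to-nearest.
    """
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    codes = []
    for w, d in zip(weights, divisors):
        q = round(w / (grid_size * d))         # element-wise division, then rounding
        codes.append(max(qmin, min(qmax, q)))  # clip to the signed integer range
    return codes


def dequantize(codes, grid_size):
    """Recover approximate weights from integer codes."""
    return [grid_size * q for q in codes]


weights = [0.26, -0.74, 0.49]
grid_size = 0.1

# Baseline: all divisors are 1.0, i.e. ordinary round-to-nearest.
rtn_codes = quantize_codes(weights, grid_size, [1.0, 1.0, 1.0])

# Hypothetical "learned" divisors: nudging the first one changes how 0.26 rounds.
flex_codes = quantize_codes(weights, grid_size, [1.1, 1.0, 1.0])

print(rtn_codes)   # [3, -7, 5]
print(flex_codes)  # [2, -7, 5]
```

The point of the sketch is only that a per-weight divisor lets the rounding direction of an individual weight change (here 0.26 rounds down instead of up) while the shared grid stays fixed.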

What papers say

Verbatim critique sentences, each from a paper that cites FlexRound as a baseline.

“FlexRound incurs considerable performance degradation on the massive multitask language understanding (MMLU) benchmark”
— LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices

Beaten on benchmarks

Head-to-head results where a newer method reports beating FlexRound. Values are copied from the source paper's tables — verify against the cited paper.

Source: LFQ: Logit-aware Final-block Quantization for Boosting the Generation Quality of Low-Bit Quantized LLMs

FlexRound+LFQ (Ours) beats FlexRound · IFEval (greedy) [Qwen2.5-7B, W4] · 71.35 vs 69.50
FlexRound+LFQ (Ours) beats FlexRound · MATH500 (greedy) [Qwen2.5-7B, W4] · 73.4 vs 72.6
FlexRound+LFQ (Ours) beats FlexRound · IFEval (greedy) [Qwen2.5-7B, W3g128] · 67.84 vs 66.54
FlexRound+LFQ (Ours) beats FlexRound · MATH500 (greedy) [Qwen2.5-7B, W3g128] · 68.0 vs 65.6
FlexRound+LFQ (Ours) beats FlexRound · IFEval (greedy) [Qwen2.5-14B, W4] · 78.00 vs 77.82
FlexRound+LFQ (Ours) beats FlexRound · MATH500 (greedy) [Qwen2.5-14B, W4] · 77.2 vs 76.4
FlexRound+LFQ (Ours) beats FlexRound · IFEval (greedy) [Qwen2.5-14B, W3g128] · 77.08 vs 75.05
FlexRound+LFQ (Ours) beats FlexRound · MATH500 (greedy) [Qwen2.5-14B, W3g128] · 71.6 vs 69.6
FlexRound+LFQ beats FlexRound · IFEval (greedy) [Llama 3.1 8B, W4, FlexRound] · 72.09 vs 70.24
FlexRound+LFQ beats FlexRound · GSM8K (greedy) [Llama 3.1 8B, W4, FlexRound] · 81.80 vs 81.35

Source: LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices

LRQ (Ours) beats FlexRound · MMLU Average Accuracy [Llama 2 7B, 4/8/8] · 45.36 vs 45.14
LRQ (Ours) beats FlexRound · MMLU Average Accuracy [Llama 2 13B, 4/8/8] · 54.49 vs 53.77
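
To put the head-to-head results above in perspective, the following Python sketch computes the margin (newer method score minus FlexRound score) for each entry. The scores are copied from the entries listed here; the tuple layout and the helper name `margin` are illustrative choices, not part of any cited paper. Treat it as a quick arithmetic check, not a statement about statistical significance.

```python
# (benchmark, model and setting, newer-method score, FlexRound score),
# copied from the head-to-head entries listed in this section.
results = [
    ("IFEval (greedy)", "Qwen2.5-7B, W4", 71.35, 69.50),
    ("MATH500 (greedy)", "Qwen2.5-7B, W4", 73.4, 72.6),
    ("IFEval (greedy)", "Qwen2.5-7B, W3g128", 67.84, 66.54),
    ("MATH500 (greedy)", "Qwen2.5-7B, W3g128", 68.0, 65.6),
    ("IFEval (greedy)", "Qwen2.5-14B, W4", 78.00, 77.82),
    ("MATH500 (greedy)", "Qwen2.5-14B, W4", 77.2, 76.4),
    ("IFEval (greedy)", "Qwen2.5-14B, W3g128", 77.08, 75.05),
    ("MATH500 (greedy)", "Qwen2.5-14B, W3g128", 71.6, 69.6),
    ("IFEval (greedy)", "Llama 3.1 8B, W4", 72.09, 70.24),
    ("GSM8K (greedy)", "Llama 3.1 8B, W4", 81.80, 81.35),
    ("MMLU Average Accuracy", "Llama 2 7B, 4/8/8", 45.36, 45.14),
    ("MMLU Average Accuracy", "Llama 2 13B, 4/8/8", 54.49, 53.77),
]


def margin(row):
    """Newer-method score minus FlexRound score, rounded to 2 decimals."""
    return round(row[2] - row[3], 2)


smallest = min(results, key=margin)
largest = max(results, key=margin)

for benchmark, setting, newer, flexround in results:
    print(f"{benchmark} [{setting}]: +{margin((benchmark, setting, newer, flexround)):.2f}")

print("smallest margin:", smallest[0], smallest[1], margin(smallest))  # 0.18
print("largest margin:", largest[0], largest[1], margin(largest))      # 2.4
```

On these numbers the reported margins range from 0.18 points (IFEval, Qwen2.5-14B, W4) to 2.4 points (MATH500, Qwen2.5-7B, W3g128), so the size of the gap varies considerably across models and settings.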