LG · Feb 9

OJBKQ: Objective-Joint Babai-Klein Quantization

arXiv:2602.08376v1 · h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses the need to compress large language models efficiently without retraining. It is an incremental improvement over existing weight-only quantization methods.

The paper tackles the noticeable degradation of low-bit post-training quantization for large language models. It introduces OJBKQ, a layer-wise method that formulates weight quantization as a joint optimization problem over activations and weights, and reports lower perplexity at 3-4 bits than existing PTQ approaches.

Post-training quantization (PTQ) is widely used to compress large language models without retraining. However, many existing weight-only methods rely on heuristic objectives and greedy rounding, which leads to noticeable degradation under low-bit quantization. In this work, we introduce OJBKQ (Objective-Joint Babai-Klein Quantization with K-Best Sampling), a layer-wise PTQ method that formulates weight quantization as a joint optimization problem over activations and weights. This formulation yields a multiple-right-hand-side box-constrained integer least squares (BILS) problem in each layer, which is NP-hard. For each column of the weight matrix, we apply an extended Babai nearest-plane algorithm and an extended version of Klein's randomized Babai algorithm to find the minimum-residual Babai-Klein point, a sub-optimal solution to the BILS problem. Experimental results on large language models show that OJBKQ achieves lower perplexity at 3-4 bits than existing PTQ approaches, while maintaining comparable computational cost.
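To make the column-wise procedure concrete, here is a minimal sketch of a box-constrained Babai nearest-plane step combined with Klein-style randomized rounding, keeping whichever candidate has the smallest residual. It is the textbook form of these algorithms, not the paper's "extended" versions, whose details the abstract does not give. Everything named here is an assumption made for illustration: the function names, the `temperature` and `num_samples` parameters, the unit-scale integer grid `[lo, hi]`, and the per-column objective `||X w - X q||` with a full-column-rank activation matrix `X`.

```python
import numpy as np

def babai_box(R, y, lo, hi):
    """Babai nearest-plane point for min ||y - R z||, z integer in [lo, hi]."""
    n = R.shape[1]
    z = np.zeros(n)
    # Back-substitute from the last coordinate, rounding and clipping each one.
    for i in range(n - 1, -1, -1):
        c = (y[i] - R[i, i + 1:] @ z[i + 1:]) / R[i, i]
        z[i] = np.clip(np.rint(c), lo, hi)
    return z

def klein_box(R, y, lo, hi, rng, temperature=1.0):
    """Klein-style randomized rounding restricted to the box (illustrative)."""
    n = R.shape[1]
    z = np.zeros(n)
    grid = np.arange(lo, hi + 1)
    for i in range(n - 1, -1, -1):
        c = (y[i] - R[i, i + 1:] @ z[i + 1:]) / R[i, i]
        # Discrete Gaussian weights centred on the unconstrained value c.
        logp = -(R[i, i] ** 2) * (grid - c) ** 2 / temperature
        p = np.exp(logp - logp.max())
        z[i] = rng.choice(grid, p=p / p.sum())
    return z

def quantize_column(X, w, lo, hi, num_samples=16, seed=0):
    """Return the minimum-residual candidate among Babai and Klein samples."""
    Q, R = np.linalg.qr(X)      # reduced QR; assumes X has full column rank
    y = Q.T @ (X @ w)           # ||X w - X z|| equals ||y - R z||
    rng = np.random.default_rng(seed)
    best = babai_box(R, y, lo, hi)
    best_res = np.linalg.norm(y - R @ best)
    for _ in range(num_samples):
        cand = klein_box(R, y, lo, hi, rng)
        res = np.linalg.norm(y - R @ cand)
        if res < best_res:
            best, best_res = cand, res
    return best.astype(int), best_res
```

For a 3-bit unsigned grid one might call `quantize_column(X, w, 0, 7)` once per column of the weight matrix. Because the deterministic Babai point is always among the candidates, the returned residual can never be worse than plain nearest-plane rounding in this sketch.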

