LG · ML · Nov 26, 2025

Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing

arXiv:2512.03062v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses computational resource demands for LLMs, offering incremental improvements to SVD compression methods.

The paper tackles the problem of compressing Large Language Models (LLMs) via Singular Value Decomposition (SVD) by introducing two physics-inspired improvements: FermiGrad for globally optimal rank selection and PivGa for lossless compression of low-rank factors, resulting in enhanced compression efficiency.

Large Language Models (LLMs) place heavy demands on computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach to LLM compression, but it presents several practical hurdles, such as selecting appropriate layer-wise ranks and eliminating the parameter redundancy of the decomposition. In this work, we present two physics-inspired improvements to SVD LLM compression: (1) FermiGrad, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) PivGa, an additional lossless compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.
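The abstract names the two ideas without giving formulas, so the following is only a minimal NumPy sketch of the underlying concepts, not the authors' FermiGrad or PivGa algorithms. It assumes the standard Fermi function 1 / (exp((i - mu) / T) + 1) applied to singular-value indices i, with a hypothetical cutoff `mu` and temperature `temperature`; the gradient-descent loop over `mu` is omitted. For the gauge freedom, it uses the fact that W = A B is unchanged under A -> A G^{-1}, B -> G B for any invertible G, and fixes G so that an r x r block of A becomes the identity. Which block to fix (here simply the first r rows, assumed invertible) is an illustrative choice.

```python
import numpy as np


def fermi_mask(num_values, mu, temperature):
    """Smooth step over singular-value indices: ~1 below mu, ~0 above.

    mu and temperature are hypothetical names for the continuous cutoff
    and the sharpness of the step.
    """
    idx = np.arange(num_values)
    # Clip the exponent to avoid overflow at very low temperature.
    x = np.clip((idx - mu) / temperature, -50.0, 50.0)
    return 1.0 / (np.exp(x) + 1.0)


def soft_truncated_svd(weight, mu, temperature):
    """Reconstruct weight with singular values damped by the Fermi mask."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    return (u * (s * fermi_mask(s.size, mu, temperature))) @ vt


def gauge_fix(a, b):
    """Rewrite W = a @ b so that the first r rows of a form the identity.

    Assumes the top r x r block of a is invertible (an assumption of this
    sketch). The product a @ b is unchanged, but the identity block no
    longer needs to be stored, saving r * r parameters.
    """
    r = a.shape[1]
    g = a[:r, :]
    a_fixed = np.linalg.solve(g.T, a.T).T  # a @ inv(g)
    b_fixed = g @ b
    return a_fixed, b_fixed


# Usage: rank-3 factors of a random 8 x 6 matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 6))
u, s, vt = np.linalg.svd(w, full_matrices=False)
a, b = u[:, :3] * s[:3], vt[:3, :]
a_fixed, b_fixed = gauge_fix(a, b)
# Parameters to store: r * (m + n) before, r * (m + n) - r * r after.
params_before = a.size + b.size
params_after = params_before - 3 * 3
```

At low temperature the mask approaches a hard step, so the soft truncation recovers ordinary rank truncation, while at finite temperature the cutoff `mu` becomes a continuous quantity that a gradient-based optimizer can adjust.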
