CL · Mar 16, 2025

SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

arXiv:2503.12340v1 · 55 citations · h-index: 16 · Has Code · NAACL
Originality: Incremental advance
AI Analysis

This work addresses the deployment challenge of LLMs due to their large sizes, offering a domain-specific compression technique that is incremental in nature.

The paper tackles the problem of compressing Large Language Models (LLMs) by optimizing singular value truncation in SVD-based compression, resulting in improved performance over state-of-the-art methods on ten datasets and five LLMs.

Despite significant advancements, the practical deployment of Large Language Models (LLMs) is often hampered by their immense sizes, highlighting the need for effective compression techniques. Singular Value Decomposition (SVD) is a promising LLM compression technique. However, existing SVD-based compression methods fall short in reducing truncation losses, leading to less competitive performance in compressed models. In this work, we introduce SVD-LLM V2, an SVD-based LLM compression method that optimizes singular value truncation in SVD compression with two techniques. First, SVD-LLM V2 proposes to use the theoretical truncation loss of weight matrices to assign a unique compression ratio to each weight matrix at different layers to accommodate weight redundancy heterogeneity. Second, SVD-LLM V2 proposes loss-optimized weight truncation to ensure that the truncated singular values result in a lower and more stable truncation loss in practice. We evaluate SVD-LLM V2 on ten datasets and five LLMs at various scales. Our results show SVD-LLM V2 outperforms state-of-the-art SVD-based LLM compression methods. Our code is available at https://github.com/AIoT-MLSys-Lab/SVD-LLM
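The abstract says each weight matrix receives its own compression ratio. As a minimal sketch of what a per-matrix ratio implies for truncation, the snippet below converts a ratio into a truncation rank, assuming the compressed matrix is stored as two low-rank factors of shapes rows × k and k × cols. The function names, layer names, shapes, and ratios are hypothetical illustrations; this is not the paper's allocation rule, which derives the ratios from the theoretical truncation loss.

```python
def rank_for_ratio(rows: int, cols: int, ratio: float) -> int:
    """Largest rank k whose two factors fit within the parameter budget.

    Assumption: a rank-k factorization stores k * (rows + cols) parameters,
    versus rows * cols for the dense matrix.
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    budget = (1.0 - ratio) * rows * cols      # parameters we may keep
    return max(1, int(budget // (rows + cols)))


def achieved_ratio(rows: int, cols: int, rank: int) -> float:
    """Fraction of parameters removed by a rank-`rank` factorization."""
    return 1.0 - rank * (rows + cols) / (rows * cols)


# Hypothetical per-matrix settings: (rows, cols, target compression ratio).
example_matrices = {
    "attn.q_proj": (4096, 4096, 0.2),
    "mlp.up_proj": (4096, 11008, 0.4),
}

for name, (rows, cols, ratio) in example_matrices.items():
    k = rank_for_ratio(rows, cols, ratio)
    print(f"{name}: rank={k}, achieved ratio={achieved_ratio(rows, cols, k):.4f}")
```

Because the rank is rounded down, the achieved ratio is never below the target. Note that under this storage assumption a square matrix must drop below half of its full rank before any parameters are saved.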

Code Implementations: 1 repo
