LGAIMay 23, 2025

Generalized Fisher-Weighted SVD: Scalable Kronecker-Factored Fisher Approximation for Compressing Large Language Models

arXiv:2505.17974v14 citationsh-index: 5
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient LLM compression for deployment, offering a more accurate method than diagonal approximations, though it is incremental as it builds on existing Fisher-based compression techniques.

The paper tackles the problem of compressing large language models (LLMs) by proposing Generalized Fisher-Weighted SVD (GFWSVD), a post-training compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix to better reflect parameter importance, resulting in improvements such as outperforming baselines by 3-6 percent at a 20 compression rate on the MMLU benchmark.

The Fisher information is a fundamental concept for characterizing the sensitivity of parameters in neural networks. However, leveraging the full observed Fisher information is too expensive for large models, so most methods rely on simple diagonal approximations. While efficient, this approach ignores parameter correlations, often resulting in reduced performance on downstream tasks. In this work, we mitigate these limitations and propose Generalized Fisher-Weighted SVD (GFWSVD), a post-training LLM compression technique that accounts for both diagonal and off-diagonal elements of the Fisher information matrix, providing a more accurate reflection of parameter importance. To make the method tractable, we introduce a scalable adaptation of the Kronecker-factored approximation algorithm for the observed Fisher information. We demonstrate the effectiveness of our method on LLM compression, showing improvements over existing compression baselines. For example, at a 20 compression rate on the MMLU benchmark, our method outperforms FWSVD, which is based on a diagonal approximation of the Fisher information, by 5 percent, SVD-LLM by 3 percent, and ASVD by 6 percent compression rate.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes