CL LG · Jun 3, 2025

ProcrustesGPT: Compressing LLMs with Structured Matrices and Orthogonal Transformations

Ekaterina Grishina, Mikhail Gorbunov, Maxim Rakhuba

arXiv:2506.02818v14.91 citations · h-index: 6 · Has Code · ACL

Originality: Incremental advance

AI Analysis

This work targets the resource cost of LLMs for AI practitioners. Its compression method is new, but the advance is incremental because it builds on existing structured matrix techniques.

The paper addresses LLM compression, aiming to reduce compute and memory requirements. It applies orthogonal transformations that make weight matrices more compressible within structured matrix classes, and reports significant parameter reduction without fine-tuning.

Large language models (LLMs) demonstrate impressive results in natural language processing tasks but require a significant amount of computational and memory resources. Structured matrix representations are a promising way to reduce the number of parameters of these models. However, it seems unrealistic to expect that weight matrices of pretrained models can be accurately represented by structured matrices without any fine-tuning. To overcome this issue, we utilize the fact that LLM output is invariant under certain orthogonal transformations of weight matrices. This insight can be leveraged to identify transformations that significantly improve the compressibility of weights within structured classes. The proposed approach is applicable to various types of structured matrices that support efficient projection operations. Code is available at https://github.com/GrishKate/ProcrustesGPT
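The idea in the abstract can be illustrated with a toy NumPy sketch. It is not the authors' implementation. It assumes a block-diagonal structured class (chosen only because its projection is a simple mask), a pair of plain linear maps standing in for adjacent layers, and hypothetical helper names (`project_block_diagonal`, `procrustes`, `fit_rotation`). The loop alternates between projecting the rotated weight onto the structured class and solving the classical orthogonal Procrustes problem for the rotation.

```python
import numpy as np

def project_block_diagonal(M, block):
    """Frobenius-nearest block-diagonal matrix: keep diagonal blocks, zero the rest."""
    n = M.shape[0]
    mask = np.zeros_like(M, dtype=bool)
    for start in range(0, n, block):
        mask[start:start + block, start:start + block] = True
    return np.where(mask, M, 0.0)

def procrustes(W, S):
    """Orthogonal Q minimizing ||W @ Q - S||_F (orthogonal Procrustes via SVD)."""
    U, _, Vt = np.linalg.svd(W.T @ S)
    return U @ Vt

def fit_rotation(W, block, iters=20):
    """Alternate between structured projection and the Procrustes update for Q."""
    Q = np.eye(W.shape[1])
    errors = []
    for _ in range(iters):
        S = project_block_diagonal(W @ Q, block)   # best structured fit for current Q
        errors.append(np.linalg.norm(W @ Q - S))
        Q = procrustes(W, S)                       # best rotation for current S
    S = project_block_diagonal(W @ Q, block)
    errors.append(np.linalg.norm(W @ Q - S))
    return Q, S, errors

rng = np.random.default_rng(0)
n = 8
W0 = rng.standard_normal((n, n))  # toy "previous layer" (assumption)
W1 = rng.standard_normal((n, n))  # toy weight to be compressed (assumption)
x = rng.standard_normal(n)

Q, S, errors = fit_rotation(W1, block=4)

# Invariance: rotating W1 by Q and the previous layer by Q^T leaves the output unchanged.
y_ref = W1 @ (W0 @ x)
y_rot = (W1 @ Q) @ ((Q.T @ W0) @ x)

print("output preserved:", np.allclose(y_ref, y_rot))
print("approximation error before / after:", errors[0], errors[-1])
```

Each step of the alternation minimizes the same Frobenius objective over one variable, so the recorded error cannot increase from one iteration to the next. A low-rank class would not work as an illustration here, because the truncated SVD error of `W @ Q` is the same for every orthogonal `Q`.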
