LG | AI | Apr 23, 2025

BackSlash: Rate Constrained Optimized Training of Large Language Models

arXiv:2504.16968v37.11 citations | h-index: 14 | ICML

Originality: Highly original

AI Analysis

This paper addresses the need for efficient training and deployment of LLMs. It offers a novel compression approach that is versatile enough for edge devices, while its application of rate-distortion optimization to training is an incremental step.

The paper tackles the problem of compressing large language models during training rather than afterwards. It introduces Rate-Constrained Training (BackSlash), which reduces memory usage by 60%-90% without accuracy loss and keeps accuracy stable at pruning rates of up to 80%.

The rapid advancement of large language models (LLMs) has driven extensive research into compressing parameters after training is complete, yet compression during the training phase remains largely unexplored. In this work, we introduce Rate-Constrained Training (BackSlash), a novel training-time compression approach based on rate-distortion optimization (RDO). BackSlash enables a flexible trade-off between model accuracy and complexity, significantly reducing parameter redundancy while preserving performance. Experiments across various architectures and tasks demonstrate that BackSlash can reduce memory usage by 60%-90% without accuracy loss and provides a significant compression gain over post-training compression. Moreover, BackSlash proves to be highly versatile: it enhances generalization with small Lagrange multipliers, improves model robustness to pruning (maintaining accuracy even at 80% pruning rates), and enables network simplification for accelerated inference on edge devices.
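To make the rate-distortion trade-off concrete, the sketch below minimizes a Lagrangian objective J = D + lambda * R on a toy linear model. It is an illustration under stated assumptions, not the BackSlash algorithm: the abstract does not specify the rate model, so the sum of absolute weights is used here as a stand-in rate proxy, the optimizer is a plain proximal gradient step, and all names (`rd_objective`, `train`, `lam`) are hypothetical.

```python
import random

def rd_objective(weights, xs, ys, lam):
    """Return (distortion, rate, J) where J = distortion + lam * rate."""
    n = len(xs)
    # Distortion: mean squared error of the linear model.
    distortion = sum(
        (sum(w * x for w, x in zip(weights, row)) - y) ** 2
        for row, y in zip(xs, ys)
    ) / n
    # Rate proxy (assumption): sum of |w|, a stand-in for coding cost.
    rate = sum(abs(w) for w in weights)
    return distortion, rate, distortion + lam * rate

def train(xs, ys, lam, lr=0.05, steps=500):
    """Minimize distortion + lam * rate with proximal gradient descent."""
    dim, n = len(xs[0]), len(xs)
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for row, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, row)) - y
            for j in range(dim):
                grad[j] += 2.0 * err * row[j] / n
        for j in range(dim):
            v = w[j] - lr * grad[j]           # gradient step on distortion
            shrunk = max(abs(v) - lr * lam, 0.0)  # soft-threshold for the rate term
            w[j] = shrunk if v > 0 else -shrunk
    return w

# Synthetic data: only the first two of four features matter.
rng = random.Random(0)
xs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
ys = [3.0 * row[0] - 2.0 * row[1] for row in xs]

w_free = train(xs, ys, lam=0.0)   # no rate constraint
w_rc = train(xs, ys, lam=0.5)     # rate-constrained
d_free, r_free, _ = rd_objective(w_free, xs, ys, 0.0)
d_rc, r_rc, _ = rd_objective(w_rc, xs, ys, 0.5)
print("unconstrained:", w_free, d_free, r_free)
print("constrained:  ", w_rc, d_rc, r_rc)
```

Raising `lam` trades a higher distortion for a lower rate, and in this toy setting it drives the weights of the irrelevant features to exactly zero, which loosely mirrors the accuracy-versus-complexity trade-off the abstract attributes to the Lagrange multiplier.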

