CL · Feb 15, 2025

1bit-Merging: Dynamic Quantized Merging for Large Language Models

arXiv:2502.10743v2 · 5 citations · h-index: 8
Originality: Incremental advance
AI Analysis

1bit-Merging addresses the storage cost that practitioners face when deploying multiple specialized models, though the advance is incremental because it builds on existing task-specific routing methods.

The paper tackles the problem of merging specialized large language models without sacrificing task-specific performance or incurring high storage costs. It introduces 1bit-Merging, which combines 1-bit quantized task vectors with targeted, layer-specific compression, and reports performance comparable or superior to existing methods at significantly lower storage cost.

Recent advances in large language models have led to specialized models that excel in specific domains, creating a need for efficient model merging techniques. Traditional merging approaches combine parameters into a single static model, but they often compromise task-specific performance. Task-specific routing methods, by contrast, maintain accuracy but introduce substantial storage overhead. We present 1bit-Merging, a novel framework that integrates task-specific routing with 1-bit quantized task vectors to balance performance and storage efficiency. Our approach leverages the observation that different task-specific models store knowledge in distinct layers (chat models primarily in attention layers, math and code models in MLP layers), which enables targeted compression strategies. Through extensive experiments with the LLaMA2 and Mistral model families across chat, mathematical reasoning, and code generation tasks, we demonstrate that 1bit-Merging achieves comparable or superior performance to existing methods while significantly reducing storage requirements. Our framework offers a practical solution for combining specialized models while maintaining their individual strengths and addressing the storage challenges of current approaches.
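
To make the idea of a 1-bit quantized task vector with routing concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the quantization rule, so the sketch assumes a common sign-plus-scale scheme (keep only the sign of each entry of the task vector, plus one scalar equal to the mean absolute value). The function names, the `task_vectors` table, and the toy weights are all hypothetical.

```python
# Sketch only: sign-plus-scale 1-bit quantization is an ASSUMPTION here,
# not a rule taken from the paper. All names and numbers are hypothetical.

def quantize_task_vector(base, finetuned):
    """Compress the task vector (finetuned - base) to signs plus one scale."""
    delta = [f - b for b, f in zip(base, finetuned)]
    # One shared scale: the mean absolute value of the task vector.
    scale = sum(abs(d) for d in delta) / len(delta)
    # One bit of information per weight: +1 or -1.
    signs = [1 if d >= 0 else -1 for d in delta]
    return signs, scale

def reconstruct(base, signs, scale):
    """Approximate the fine-tuned weights from the base and the 1-bit vector."""
    return [b + scale * s for b, s in zip(base, signs)]

def route(task, base, task_vectors):
    """Task-specific routing: apply the stored 1-bit vector for `task`."""
    signs, scale = task_vectors[task]
    return reconstruct(base, signs, scale)

# Toy weights standing in for one layer of a shared base model.
base = [0.50, -0.20, 0.10, 0.80]
finetuned_math = [0.53, -0.21, 0.08, 0.84]

signs, scale = quantize_task_vector(base, finetuned_math)
task_vectors = {"math": (signs, scale)}
approx_math = route("math", base, task_vectors)
```

Under this assumed scheme, each specialized model costs one sign bit per weight plus a single scalar on top of the shared base model, rather than a full copy of the weights, which is the kind of storage saving the abstract describes.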

