LG, CL · Oct 16, 2024

Rethinking Token Reduction for State Space Models

Harvard
arXiv:2410.14725v1 · 31 citations · h-index: 18 · EMNLP
Originality: Incremental advance
AI Analysis

This work addresses efficiency challenges for SSMs, which are important for applications requiring long-range dependencies, but it is incremental as it builds on existing token reduction methods.

The paper addresses the performance drops that occur when existing token reduction techniques are applied directly to State Space Models (SSMs) such as Mamba. It proposes a tailored method that integrates token importance and similarity for intra-layer reduction. The reported result is an average accuracy improvement of 5.7% to 13.1% on six benchmarks with Mamba-2 compared to existing methods, along with reduced computational and memory demands.

Recent advancements in State Space Models (SSMs) have attracted significant interest, particularly in models optimized for parallel training and handling long-range dependencies. Architectures like Mamba have scaled to billions of parameters with selective SSM. To facilitate broader applications using Mamba, exploring its efficiency is crucial. While token reduction techniques offer a straightforward post-training strategy, we find that applying existing methods directly to SSMs leads to substantial performance drops. Through insightful analysis, we identify the reasons for this failure and the limitations of current techniques. In response, we propose a tailored, unified post-training token reduction method for SSMs. Our approach integrates token importance and similarity, thus taking advantage of both pruning and merging, to devise a fine-grained intra-layer token reduction strategy. Extensive experiments show that our method improves the average accuracy by 5.7% to 13.1% on six benchmarks with Mamba-2 compared to existing methods, while significantly reducing computational demands and memory requirements.
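
To make the idea of combining pruning and merging concrete, here is a minimal sketch of one hybrid reduction step on a single layer's token features. It is not the authors' algorithm: the function name `reduce_tokens`, the `sim_threshold` parameter, the use of cosine similarity, and the assumption that importance scores are supplied by the caller are all illustrative choices. The least important tokens are merged (averaged) into their most similar retained token when the similarity is high enough, and pruned otherwise.

```python
import numpy as np


def reduce_tokens(hidden, importance, keep, sim_threshold=0.5):
    """Hypothetical hybrid prune/merge step (illustrative, not the paper's method).

    hidden:        (num_tokens, dim) token features for one layer
    importance:    (num_tokens,) per-token scores; how they are computed
                   is left to the caller (an assumption of this sketch)
    keep:          number of tokens to retain
    sim_threshold: minimum cosine similarity for merging instead of pruning
    """
    hidden = np.asarray(hidden, dtype=float)
    importance = np.asarray(importance, dtype=float)
    num_tokens = hidden.shape[0]
    if not 0 < keep <= num_tokens:
        raise ValueError("keep must be between 1 and the number of tokens")

    # Retain the `keep` most important tokens, preserving sequence order.
    order = np.argsort(-importance, kind="stable")
    kept_idx = np.sort(order[:keep])
    dropped_idx = np.sort(order[keep:])

    kept = hidden[kept_idx]
    merged = kept.copy()
    counts = np.ones(keep)

    if dropped_idx.size:
        def normalize(x):
            norms = np.linalg.norm(x, axis=1, keepdims=True)
            return x / np.maximum(norms, 1e-12)

        # Cosine similarity between each dropped token and each kept token.
        sim = normalize(hidden[dropped_idx]) @ normalize(kept).T
        best = sim.argmax(axis=1)
        best_sim = sim.max(axis=1)

        for row, (target, score) in enumerate(zip(best, best_sim)):
            if score >= sim_threshold:
                # Merge: fold the dropped token into its nearest kept token.
                merged[target] += hidden[dropped_idx[row]]
                counts[target] += 1
            # Otherwise the token is pruned (discarded).

    # Average each kept token with everything merged into it.
    merged /= counts[:, None]
    return merged, kept_idx
```

In this sketch, a dropped token that closely resembles a retained one contributes to that token's averaged feature, while a dropped token with no close match is simply discarded. A real SSM implementation would also have to account for how removing tokens interacts with the recurrent state, which this example does not model.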

Code Implementations: 1 repo