CL CE LG

May 29, 2025

The Surprising Soupability of Documents in State Space Models

Yasaman Jafari, Zixian Wang, Leon Bergen, Taylor Berg-Kirkpatrick

arXiv:2505.24033v12.7

h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses the efficiency of document processing for downstream tasks such as QA and retrieval, though it appears incremental because it builds on existing model souping and SSM techniques.

The paper tackles the problem of efficiently reusing document representations in State Space Models by proposing document souping, in which hidden states from independently encoded documents are pooled via simple operations such as averaging. The finetuned models show strong accuracy on multi-hop QA, sparse retrieval, and long-document reasoning; on HotpotQA, souping ten documents nearly matches cross-encoder performance.

We investigate whether hidden states from Structured State Space Models (SSMs) can be merged post-hoc to support downstream reasoning. Inspired by model souping, we propose a strategy where documents are encoded independently and their representations are pooled -- via simple operations like averaging -- into a single context state. This approach, which we call document souping, enables modular encoding and reuse without reprocessing the full input for each query. We finetune Mamba2 models to produce soupable representations and find that they support multi-hop QA, sparse retrieval, and long-document reasoning with strong accuracy. On HotpotQA, souping ten independently encoded documents nearly matches the performance of a cross-encoder trained on the same inputs.
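The pooling step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: a toy diagonal linear recurrence stands in for a finetuned Mamba2 encoder, and every name here (`encode_document`, `soup_states`, `decay`, `in_proj`) is a hypothetical choice made for illustration. It only shows the shape of the idea: each document is encoded on its own, and the resulting final states are averaged into a single context state.

```python
import numpy as np

def encode_document(tokens, decay, in_proj):
    """Toy stand-in for an SSM encoder (assumption, not Mamba2).

    Runs the diagonal linear recurrence h_t = decay * h_{t-1} + in_proj[token]
    and returns the final hidden state.
    """
    state = np.zeros(in_proj.shape[1])
    for tok in tokens:
        state = decay * state + in_proj[tok]
    return state

def soup_states(states):
    """Pool per-document states by elementwise averaging."""
    return np.mean(np.stack(states), axis=0)

# Hypothetical toy parameters; a real model would learn these.
rng = np.random.default_rng(0)
vocab_size, state_dim = 50, 8
decay = rng.uniform(0.5, 0.99, size=state_dim)
in_proj = rng.normal(size=(vocab_size, state_dim))

# Three "documents" of different lengths, each encoded independently.
documents = [rng.integers(0, vocab_size, size=n).tolist() for n in (12, 7, 20)]
doc_states = [encode_document(doc, decay, in_proj) for doc in documents]

# A single pooled context state with the same size as one document state.
souped = soup_states(doc_states)
```

Because each entry of `doc_states` depends only on its own document, the states can be computed once and cached, then averaged in any combination at query time. Averaging is also order-invariant, so the pooled state does not depend on the order in which the documents are listed.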
