GNCVLG Apr 7, 2025

Leveraging State Space Models in Long Range Genomics

arXiv:2504.06304v2, 1 citation, h-index: 3
Originality: Incremental advance
AI Analysis

This work addresses computational bottlenecks in long-range genomic analysis, though it is incremental because it benchmarks existing SSM architectures against a transformer baseline.

The paper tackled the problem of modeling long-range dependencies in genomics, where conventional transformer-based methods struggle with computational complexity and sequence length extrapolation. The researchers found that State Space Models (SSMs) matched transformer performance, handled contexts 10 to 100 times longer than training sequences, and could process 1M tokens efficiently on a single GPU.

Long-range dependencies are critical for understanding genomic structure and function, yet most conventional methods struggle with them. Widely adopted transformer-based models, while excelling at short-context tasks, are limited by the attention module's quadratic computational complexity and inability to extrapolate to sequences longer than those seen in training. In this work, we explore State Space Models (SSMs) as a promising alternative by benchmarking two SSM-inspired architectures, Caduceus and Hawk, on long-range genomics modeling tasks under conditions parallel to a 50M parameter transformer baseline. We discover that SSMs match transformer performance and exhibit impressive zero-shot extrapolation across multiple tasks, handling contexts 10 to 100 times longer than those seen during training, indicating more generalizable representations better suited for modeling the long and complex human genome. Moreover, we demonstrate that these models can efficiently process sequences of 1M tokens on a single GPU, allowing for modeling entire genomic regions at once, even in labs with limited compute. Our findings establish SSMs as efficient and scalable for long-context genomic analysis.
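The scaling argument in the abstract can be illustrated with a small sketch. It is not the Caduceus or Hawk architecture and is not taken from the paper: it only contrasts the number of pairwise scores that naive attention computes with a toy scalar linear recurrence of the kind that state space models build on. The function names and the parameters `a`, `b`, and `c` are hypothetical, chosen for illustration.

```python
from typing import Iterable, Iterator


def attention_score_entries(seq_len: int) -> int:
    """Pairwise scores computed by naive full attention: one per (query, key) pair."""
    return seq_len * seq_len


def linear_recurrence(
    xs: Iterable[float], a: float = 0.9, b: float = 1.0, c: float = 1.0
) -> Iterator[float]:
    """Toy scalar recurrence (illustrative only, not a real SSM layer):

        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t

    It keeps a single running state, so work grows linearly with sequence
    length and memory does not grow with it at all.
    """
    h = 0.0
    for x in xs:
        h = a * h + b * x
        yield c * h


# Growing the context 100x multiplies the attention score count by 10,000.
short_len, long_len = 10_000, 1_000_000
ratio = attention_score_entries(long_len) // attention_score_entries(short_len)

# The recurrence has no built-in maximum length: the same parameters
# apply to any number of steps.
ys = list(linear_recurrence([1.0, 0.0, 0.0], a=0.5))

print(ratio)  # 10000
print(ys)     # [1.0, 0.5, 0.25]
```

At 1M tokens, naive attention would compute 10^12 pairwise scores per head, while the recurrence takes 10^6 constant-size updates. Memory-efficient attention kernels avoid storing the full score matrix, but the amount of computation still grows quadratically with sequence length.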

