LG · AI · May 8

Continuity Laws for Sequential Models

arXiv:2605.08539
Predicted impact: top 36% in LG · last 90 days · Originality: Incremental advance
AI Analysis

For researchers working on sequential models, this work identifies and quantifies an underexplored inductive bias (continuity in time) that correlates with performance on temporally structured tasks.

The paper formalizes model continuity as convergence under temporal refinement and shows that S4 exhibits stable continuous behavior, while S6 (the core of Mamba) can be more sensitive to input amplitude and selective dynamics. It introduces a metric that quantifies task continuity and finds empirical alignment between task continuity, model continuity, and performance. It also shows that continuity enables a temporal subsampling strategy that improves both efficiency and performance.

Inductive biases influence the behavior and performance of sequential models. In this work, we study an underexplored inductive bias in sequential modeling: continuity in time. We ask a simple question: do models motivated by continuous-time formulations, such as state-space models, actually behave continuously in time, and does this translate into better performance on tasks with continuous temporal structure? To answer this, we formalize model continuity as convergence under temporal refinement, where a model is continuous if its predictions approach an underlying continuous trajectory as the temporal discretization is refined. We show that S4 exhibits stable continuous behavior, whereas S6 (the core of Mamba) can be more sensitive to input amplitude and selective dynamics, despite being derived from a continuous dynamical system. To study whether this distinction matters for learning, we also need a corresponding notion of task continuity. We therefore introduce a metric to quantify the continuity of datasets directly from their temporal structure. Across benchmarks, we find a clear empirical alignment between task continuity, model continuity, and model performance. Beyond an inductive bias, continuity also has practical consequences: we show that it enables a simple temporal subsampling strategy that improves both efficiency and performance.
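
The idea of convergence under temporal refinement can be illustrated with a minimal sketch. This is not the paper's procedure or metric: it assumes a hypothetical scalar linear state-space system x' = a*x + b*u(t), y = c*x, stepped with a zero-order-hold discretization, and it uses the closed-form solution for u(t) = sin(t) as the underlying continuous trajectory. The function names and parameter values are illustrative only.

```python
import math


def simulate_zoh(a, b, c, u, T, dt):
    """Step a scalar SSM x' = a*x + b*u(t), y = c*x with zero-order hold.

    Assumes a != 0 and x(0) = 0; the input is held at its left-endpoint
    sample u(k*dt) over each step.
    """
    n_steps = round(T / dt)
    a_bar = math.exp(a * dt)           # discretized state transition
    b_bar = (a_bar - 1.0) / a * b      # discretized input weight
    x = 0.0
    for k in range(n_steps):
        x = a_bar * x + b_bar * u(k * dt)
    return c * x


def exact_output(a, b, c, T):
    """Closed-form y(T) for u(t) = sin(t) and x(0) = 0."""
    return c * b / (1.0 + a * a) * (
        math.exp(a * T) - a * math.sin(T) - math.cos(T)
    )


def refinement_errors(a=-1.0, b=1.0, c=1.0, T=2.0,
                      steps=(0.2, 0.1, 0.05, 0.025)):
    """Absolute error at time T for successively finer step sizes."""
    target = exact_output(a, b, c, T)
    return [abs(simulate_zoh(a, b, c, math.sin, T, dt) - target)
            for dt in steps]


errors = refinement_errors()
for dt, err in zip((0.2, 0.1, 0.05, 0.025), errors):
    print(f"dt={dt:<6} error={err:.6f}")
```

In this toy setting the error shrinks as the step size is halved, which is the behavior the abstract describes as continuous. A model whose outputs failed to settle toward a common trajectory under the same refinement would not be continuous in this sense.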
