Analysis of Long Range Dependency Understanding in State Space Models
This work provides interpretability insights for state-space models in a domain-specific context, though it is incremental because it analyzes an existing model rather than proposing new methods.
The authors conducted the first systematic kernel interpretability study of the diagonalized state-space model (S4D) on a real-world vulnerability detection task. They show that its long-range modeling capability varies significantly with architecture, which affects performance: depending on the architecture, the kernel can behave as a low-pass, band-pass, or high-pass filter.
Although state-space models (SSMs) have demonstrated strong performance on long-sequence benchmarks, most research has emphasized predictive accuracy rather than interpretability. In this work, we present the first systematic kernel interpretability study of the diagonalized state-space model (S4D) trained on a real-world task (vulnerability detection in source code). Through time- and frequency-domain analysis of the S4D kernel, we show that the long-range modeling capability of S4D varies significantly under different model architectures, affecting model performance. For instance, we show that, depending on the architecture, the S4D kernel can behave as a low-pass, band-pass, or high-pass filter. The insights from our analysis can guide future work in designing better S4D-based models.
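To make the frequency-domain view concrete, the following is a minimal sketch of how a diagonal SSM kernel can be computed and inspected. It is not the authors' analysis code. It assumes the commonly used S4D formulation with zero-order-hold discretization and conjugate-pair symmetry (hence the factor of 2 and the real part). The function names `s4d_kernel` and `classify_filter`, and the 1/3 and 2/3 thresholds on the peak frequency, are hypothetical choices made only for illustration.

```python
import numpy as np

def s4d_kernel(A, B, C, dt, L):
    """Length-L convolution kernel of a diagonal SSM.

    Assumes zero-order-hold discretization and that each complex mode
    stands in for a conjugate pair (factor 2, real part).
    A, B, C are complex arrays of shape (N,); dt is the step size.
    """
    A = np.asarray(A, dtype=complex)
    B = np.asarray(B, dtype=complex)
    C = np.asarray(C, dtype=complex)
    B_bar = (np.exp(dt * A) - 1.0) / A * B            # discretized input vector
    # powers[n, l] = exp(dt * A_n * l), i.e. the discretized A_n raised to l
    powers = np.exp(dt * A[:, None] * np.arange(L)[None, :])
    return 2.0 * np.real((C * B_bar) @ powers)        # shape (L,)

def classify_filter(kernel, low=1 / 3, high=2 / 3):
    """Crude heuristic: label a kernel by where its magnitude spectrum peaks.

    The thresholds are illustrative assumptions, expressed as fractions
    of the Nyquist frequency.
    """
    mag = np.abs(np.fft.rfft(kernel))
    peak = np.argmax(mag) / (len(mag) - 1)            # 0 = DC, 1 = Nyquist
    if peak < low:
        return "low-pass"
    if peak < high:
        return "band-pass"
    return "high-pass"

if __name__ == "__main__":
    dt, L = 0.1, 256
    # Three single-mode kernels whose oscillation frequency is 0, pi/2, pi per step
    for omega in (0.0, np.pi / 2, np.pi):
        A = np.array([-0.5 + 1j * omega / dt])
        k = s4d_kernel(A, np.ones(1), np.ones(1), dt, L)
        print(f"omega={omega:.2f} rad/step -> {classify_filter(k)}")
```

In this sketch, the real part of each diagonal entry of A controls how quickly the kernel decays in time (and therefore how far back it can look), while the imaginary part sets the oscillation frequency and so where the spectrum peaks. A purely real, slowly decaying mode yields a kernel whose spectrum peaks at zero frequency.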