On the Relation of State Space Models and Hidden Markov Models

Aydin Ghojogh, M. Hadi Sepanj, Benyamin Ghojogh

arXiv:2601.13357v11.4

Originality: Synthesis-oriented

AI Analysis

This work provides a unified analysis that bridges classical probabilistic models and modern deep learning for sequential data. Its contribution is incremental: it synthesizes existing knowledge without introducing new methods.

The paper systematically compares Hidden Markov Models (HMMs), linear Gaussian state space models, Kalman filtering, and modern NLP state space models such as S4 and Mamba. It analyzes their formulations, inference algorithms, and learning procedures to clarify how they relate and where they differ.

State Space Models (SSMs) and Hidden Markov Models (HMMs) are foundational frameworks for modeling sequential data with latent variables and are widely used in signal processing, control theory, and machine learning. Despite their shared temporal structure, they differ fundamentally in the nature of their latent states, probabilistic assumptions, inference procedures, and training paradigms. Recently, deterministic state space models have re-emerged in natural language processing through architectures such as S4 and Mamba, raising new questions about the relationship between classical probabilistic SSMs, HMMs, and modern neural sequence models. In this paper, we present a unified and systematic comparison of HMMs, linear Gaussian state space models, Kalman filtering, and contemporary NLP state space models. We analyze their formulations through the lens of probabilistic graphical models, examine their inference algorithms, including forward-backward inference and Kalman filtering, and contrast their learning procedures via Expectation-Maximization and gradient-based optimization. By highlighting both structural similarities and semantic differences, we clarify when these models are equivalent, when they fundamentally diverge, and how modern NLP SSMs relate to classical probabilistic models. Our analysis bridges perspectives from control theory, probabilistic modeling, and modern deep learning.
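The shared temporal structure mentioned in the abstract can be illustrated with a minimal sketch, which is not taken from the paper. It places one step of the HMM forward recursion next to one step of a scalar Kalman filter. Both follow the same predict-then-update pattern; the difference is that the HMM state is discrete while the Kalman state is a continuous Gaussian. All function names, parameter names, and numeric values below are assumptions chosen for illustration.

```python
def hmm_forward_step(alpha, A, b):
    """One normalized step of the HMM forward recursion (discrete latent state).

    alpha: previous filtered belief, alpha[i] = p(z_{t-1} = i | x_{1:t-1})
    A:     transition matrix, A[i][j] = p(z_t = j | z_{t-1} = i)
    b:     likelihood of the current observation, b[j] = p(x_t | z_t = j)
    """
    n = len(alpha)
    # Predict: propagate the belief through the transition matrix.
    predicted = [sum(alpha[i] * A[i][j] for i in range(n)) for j in range(n)]
    # Update: reweight by the observation likelihood, then renormalize.
    unnormalized = [predicted[j] * b[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def kalman_step(mean, var, a, q, c, r, x):
    """One step of a scalar Kalman filter (continuous Gaussian latent state).

    Assumed model: z_t = a * z_{t-1} + noise with variance q,
                   x_t = c * z_t     + noise with variance r.
    """
    # Predict: propagate mean and variance through the linear dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: correct the prediction using the new observation x.
    gain = pred_var * c / (c * c * pred_var + r)
    new_mean = pred_mean + gain * (x - c * pred_mean)
    new_var = (1.0 - gain * c) * pred_var
    return new_mean, new_var


# Hypothetical two-state HMM and scalar linear Gaussian model.
belief = hmm_forward_step([0.5, 0.5], [[0.9, 0.1], [0.2, 0.8]], [0.8, 0.1])
mean, var = kalman_step(mean=0.0, var=1.0, a=1.0, q=0.0, c=1.0, r=1.0, x=2.0)
```

In both functions the belief is first pushed through the dynamics and then corrected by the observation. The sum over discrete states in the HMM step plays the role that the closed-form Gaussian propagation plays in the Kalman step.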
