MS · LG · May 28

libhmm: A Modern C++20 Library for Hidden Markov Models with Correct MLE Emission M-Steps

arXiv:2605.29208 · 7.0
Predicted impact: top 72% in MS · last 90 days
Originality: Synthesis-oriented
AI Analysis

For developers needing a production-ready, zero-dependency C++ HMM library with accurate parameter estimation, libhmm fills a gap by offering correct MLE and SIMD acceleration.

libhmm is a C++20 HMM library that provides correct MLE emission M-steps for 16 distributions, avoiding method-of-moments approximations. Benchmarks show it matches or outperforms existing libraries on five real datasets.

We describe libhmm, a C++20 library for Hidden Markov Model parameter estimation, sequence decoding, and model selection. libhmm addresses two gaps in existing software: the absence of a well-maintained, zero-dependency C++ HMM library suitable for embedding in production systems, and the widespread use of method-of-moments (MOM) approximations in the emission distribution M-step of the Baum-Welch algorithm. The library implements correct maximum likelihood estimators for sixteen continuous and discrete emission distributions, including an ECME algorithm for the location-scale Student-t distribution, Newton-Raphson maximization for Gamma, Beta, Weibull, and Negative Binomial distributions, and the von Mises distribution for circular data. All forward-backward and Viterbi calculations operate in full log-space. SIMD acceleration is provided for AVX-512, AVX2, SSE2, and ARM NEON via compile-time dispatch with scalar fallback. Python bindings are available via the companion package pylibhmm. We compare libhmm against established C and C++ HMM libraries and against published R reference packages on five real-data benchmarks, and discuss the architectural tradeoffs made in the design.
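To make the contrast between a maximum likelihood M-step and a method-of-moments approximation concrete, here is a minimal sketch of a weighted Gamma fit using Newton-Raphson on the shape parameter. It is not libhmm's API: the names `digamma`, `trigamma`, `GammaParams`, `gamma_mle`, and `gamma_mom` are hypothetical, and the weights `w` stand in for the per-state posterior responsibilities that the forward-backward pass would supply. The sketch assumes strictly positive observations that are not all identical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Digamma psi(x) for x > 0: shift upward with psi(x) = psi(x+1) - 1/x,
// then apply the standard asymptotic expansion.
inline double digamma(double x) {
    double acc = 0.0;
    while (x < 10.0) { acc -= 1.0 / x; x += 1.0; }
    const double inv = 1.0 / x, inv2 = inv * inv;
    return acc + std::log(x) - 0.5 * inv
         - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0));
}

// Trigamma psi'(x) for x > 0, using psi'(x) = psi'(x+1) + 1/x^2.
inline double trigamma(double x) {
    double acc = 0.0;
    while (x < 10.0) { acc += 1.0 / (x * x); x += 1.0; }
    const double inv = 1.0 / x, inv2 = inv * inv;
    return acc + inv + 0.5 * inv2
         + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0));
}

struct GammaParams { double shape; double scale; };  // hypothetical type

// Weighted Gamma MLE (illustrative, not the library's implementation).
// Solves log(k) - psi(k) = log(mean_w) - mean_w(log x) for the shape k.
inline GammaParams gamma_mle(const std::vector<double>& x,
                             const std::vector<double>& w) {
    assert(x.size() == w.size() && !x.empty());
    double sw = 0.0, sx = 0.0, slog = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(x[i] > 0.0 && w[i] >= 0.0);
        sw += w[i];
        sx += w[i] * x[i];
        slog += w[i] * std::log(x[i]);
    }
    assert(sw > 0.0);
    const double mean = sx / sw;
    const double s = std::log(mean) - slog / sw;  // positive by Jensen's inequality
    assert(s > 0.0);  // s == 0 only if all weighted observations are equal

    // Closed-form starting point, then Newton-Raphson on f(k) = log k - psi(k) - s.
    double k = (3.0 - s + std::sqrt((s - 3.0) * (s - 3.0) + 24.0 * s)) / (12.0 * s);
    for (int it = 0; it < 50; ++it) {
        const double f = std::log(k) - digamma(k) - s;
        const double fp = 1.0 / k - trigamma(k);  // negative for all k > 0
        double next = k - f / fp;
        if (next <= 0.0) next = 0.5 * k;          // keep the iterate positive
        const bool done = std::abs(next - k) <= 1e-12 * k;
        k = next;
        if (done) break;
    }
    return {k, mean / k};  // given k, the MLE of the scale is mean / k
}

// Method-of-moments estimate for contrast: matches mean and variance only.
inline GammaParams gamma_mom(const std::vector<double>& x,
                             const std::vector<double>& w) {
    double sw = 0.0, sx = 0.0, sxx = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sw += w[i];
        sx += w[i] * x[i];
        sxx += w[i] * x[i] * x[i];
    }
    const double mean = sx / sw;
    const double var = sxx / sw - mean * mean;
    return {mean * mean / var, var / mean};
}
```

Both estimators reproduce the weighted mean, since shape times scale equals the mean in each case. They generally differ in the shape parameter, because only the MLE satisfies the likelihood score equation, which depends on the weighted mean of log x rather than on the variance.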

