Learning Hidden Markov Models from Aggregate Observations
This work addresses the problem of learning the parameters of an individual-level HMM from population-level aggregate data, which is relevant to researchers and practitioners working with privacy-preserving data or with settings where only macroscopic observations are available.
This paper introduces an algorithm to estimate parameters of a time-homogeneous hidden Markov model (HMM) using only aggregate population counts, rather than individual observations. The algorithm, based on expectation-maximization and Sinkhorn belief propagation, offers convergence guarantees and generalizes to HMMs with continuous observations.
In this paper, we propose an algorithm for estimating the parameters of a time-homogeneous hidden Markov model (HMM) from aggregate observations. This problem arises when only population-level counts of the number of individuals at each time step are available, and one seeks to learn the individual-level hidden Markov model from these counts. Our algorithm builds on expectation-maximization and the recently proposed aggregate inference algorithm, Sinkhorn belief propagation. Compared with existing methods such as expectation-maximization with non-linear belief propagation, our algorithm comes with convergence guarantees. Moreover, our learning framework naturally reduces to the standard Baum-Welch learning algorithm when the observations of a single individual are recorded. We further extend our learning algorithm to handle HMMs with continuous observations. The efficacy of our algorithm is demonstrated on a variety of datasets.
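To make the data setting concrete, the following is a minimal sketch of how aggregate observations could arise from a discrete-observation HMM. It only simulates the input data and is not the learning algorithm itself. The function name `simulate_aggregate_counts`, the parameter values, and the choice of counting individuals per observation symbol at each time step are illustrative assumptions, not details taken from the paper.

```python
import random


def simulate_aggregate_counts(init, trans, emit, num_individuals, num_steps, seed=0):
    """Simulate independent individuals that share one HMM and return,
    for each time step, how many individuals emitted each symbol.

    init  : initial state distribution, length S
    trans : S x S transition matrix (rows sum to 1), fixed over time
    emit  : S x O emission matrix (rows sum to 1)
    All names and shapes here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    num_states = len(init)
    num_symbols = len(emit[0])
    counts = [[0] * num_symbols for _ in range(num_steps)]

    for _ in range(num_individuals):
        # Draw the individual's hidden starting state.
        state = rng.choices(range(num_states), weights=init)[0]
        for t in range(num_steps):
            # Emit a symbol from the current hidden state and record only the count;
            # the identity of the individual is discarded.
            symbol = rng.choices(range(num_symbols), weights=emit[state])[0]
            counts[t][symbol] += 1
            # Time-homogeneous transition: the same matrix is used at every step.
            state = rng.choices(range(num_states), weights=trans[state])[0]
    return counts


# Illustrative parameters (assumed, not from the paper).
init = [0.6, 0.4]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

counts = simulate_aggregate_counts(init, trans, emit, num_individuals=1000, num_steps=5)
single = simulate_aggregate_counts(init, trans, emit, num_individuals=1, num_steps=5)

for t, row in enumerate(counts):
    print(t, row)
```

With `num_individuals=1`, every row of the result is a one-hot vector, so the counts carry the same information as an ordinary observation sequence. This is the single-individual setting in which the abstract states that the framework reduces to standard Baum-Welch learning.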