Clustering Time Series and the Surprising Robustness of HMMs
The paper provides a robust method for clustering time series in non-stationary settings. Its contribution is incremental: it extends the applicability of HMMs beyond their traditional assumptions.
The paper tackles the problem of estimating source distributions in a time series whose source changes occasionally. It shows that a maximum likelihood HMM estimator can approximate these distributions even when the data lacks the Markov or stationarity properties: the estimator produces the correct second moment of the data, and the result extends to higher moments.
Suppose that we are given a time series where consecutive samples are believed to come from a probabilistic source, that the source changes from time to time and that the total number of sources is fixed. Our objective is to estimate the distributions of the sources. A standard approach to this problem is to model the data as a hidden Markov model (HMM). However, since the data often lacks the Markov or the stationarity properties of an HMM, one can ask whether this approach is still suitable or perhaps another approach is required. In this paper we show that a maximum likelihood HMM estimator can be used to approximate the source distributions in a much larger class of models than HMMs. Specifically, we propose a natural and fairly general non-stationary model of the data, where the only restriction is that the sources do not change too often. Our main result shows that for this model, a maximum-likelihood HMM estimator produces the correct second moment of the data, and the results can be extended to higher moments.
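The setting described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's experiment or proof. It makes several assumptions of its own: two discrete sources over three symbols, and a switching schedule with deterministic, growing block lengths, so the data is neither Markov nor stationary, although the sources still change rarely. It fits an ordinary HMM by Baum-Welch (EM), a standard local-search approximation to the maximum likelihood estimator. All names (`generate_series`, `fit_hmm`, `max_error`) are hypothetical.

```python
import itertools
import random


def generate_series(sources, block_lengths, seed=0):
    """Draw i.i.d. symbols from one source per block, cycling through the sources."""
    rng = random.Random(seed)
    xs = []
    for b, length in enumerate(block_lengths):
        probs = sources[b % len(sources)]
        xs.extend(rng.choices(range(len(probs)), weights=probs, k=length))
    return xs


def normalize(v):
    total = sum(v)
    return [a / total for a in v]


def fit_hmm(xs, n_states, n_symbols, n_iter=60, seed=1):
    """Baum-Welch for a discrete-emission HMM, with per-step scaling."""
    rng = random.Random(seed)
    S = range(n_states)
    pi = [1.0 / n_states] * n_states
    # Assumed initialisation: "sticky" transitions, random emission rows.
    A = [normalize([20.0 if i == j else 1.0 for j in S]) for i in S]
    B = [normalize([rng.random() + 0.5 for _ in range(n_symbols)]) for _ in S]
    T = len(xs)
    for _ in range(n_iter):
        # Forward pass (each alpha[t] is rescaled to sum to 1).
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                raw = [pi[j] * B[j][xs[0]] for j in S]
            else:
                raw = [B[j][xs[t]] * sum(alpha[t - 1][i] * A[i][j] for i in S)
                       for j in S]
            c = sum(raw)
            scale.append(c)
            alpha.append([r / c for r in raw])
        # Backward pass, using the same scaling constants.
        beta = [[1.0] * n_states for _ in range(T)]
        for t in range(T - 2, -1, -1):
            beta[t] = [sum(A[i][j] * B[j][xs[t + 1]] * beta[t + 1][j] for j in S)
                       / scale[t + 1] for i in S]
        # E-step statistics: expected transition and emission counts.
        trans = [[1e-6] * n_states for _ in S]
        emit = [[1e-6] * n_symbols for _ in S]
        for t in range(T):
            for i in S:
                emit[i][xs[t]] += alpha[t][i] * beta[t][i]
                if t < T - 1:
                    for j in S:
                        trans[i][j] += (alpha[t][i] * A[i][j] * B[j][xs[t + 1]]
                                        * beta[t + 1][j] / scale[t + 1])
        # M-step: renormalise the expected counts.
        pi = normalize([alpha[0][i] * beta[0][i] for i in S])
        A = [normalize(row) for row in trans]
        B = [normalize(row) for row in emit]
    return A, B


# Assumed ground truth: two sources over three symbols.
sources = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
# Block b has length 50 + 10*b: switches are rare but follow no Markov chain.
xs = generate_series(sources, [50 + 10 * b for b in range(15)])
est_A, est_B = fit_hmm(xs, n_states=2, n_symbols=3)

# Hidden states are identified only up to relabelling, so compare under the best match.
max_error = min(
    max(abs(est_B[perm[k]][s] - sources[k][s]) for k in range(2) for s in range(3))
    for perm in itertools.permutations(range(2))
)
print("estimated emission rows:", est_B)
print("max abs error vs. true sources:", max_error)
```

The comparison is made up to a permutation of the hidden states, since an HMM fit cannot determine which state label corresponds to which source. The sketch only checks the estimated emission distributions on one synthetic series; it does not verify the paper's second-moment guarantee.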