LG, Aug 25, 2022
Time Series Clustering with an EM algorithm for Mixtures of Linear Gaussian State Space Models
Ryohei Umatani, Takashi Imai, Kaoru Kawamoto et al.
In this paper, we consider model-based time series clustering, that is, the task of clustering a set of individual time series while fitting a model to each cluster. The task requires a parametric model flexible enough to describe the dynamics of a wide variety of time series. To address this problem, we propose a novel model-based time series clustering method that uses mixtures of linear Gaussian state space models, which are highly flexible. The proposed method uses a new expectation-maximization algorithm for the mixture model to estimate the model parameters, and determines the number of clusters using the Bayesian information criterion. Experiments on a simulated dataset demonstrate the effectiveness of the method in clustering, parameter estimation, and model selection. The method is also applied to real datasets commonly used to evaluate time series clustering methods. The results show that the proposed method produces clusterings that are at least as accurate as those obtained using previous methods.
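The ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' EM algorithm: it only shows, under assumed notation (A, C, Q, R, mu0, P0 are hypothetical parameter names), how a Kalman filter gives the log-likelihood of a series under one linear Gaussian state space model, how those log-likelihoods turn into cluster responsibilities as in a generic mixture E-step, and how the standard BIC formula is evaluated. The parameter-update (M) step is omitted.

```python
import numpy as np

def kalman_loglik(y, A, C, Q, R, mu0, P0):
    """Log-likelihood of one series y (shape T x d_obs) under the model
    x_t = A x_{t-1} + w_t, w_t ~ N(0, Q);  y_t = C x_t + v_t, v_t ~ N(0, R),
    where (mu0, P0) is the assumed prior mean and covariance of x_1."""
    mu, P = mu0, P0
    ll = 0.0
    for t in range(len(y)):
        e = y[t] - C @ mu                  # innovation
        S = C @ P @ C.T + R                # innovation covariance
        ll += -0.5 * (len(e) * np.log(2 * np.pi)
                      + np.linalg.slogdet(S)[1]
                      + e @ np.linalg.solve(S, e))
        K = P @ C.T @ np.linalg.inv(S)     # Kalman gain
        mu_f = mu + K @ e                  # filtered mean
        P_f = P - K @ C @ P                # filtered covariance
        mu = A @ mu_f                      # one-step prediction
        P = A @ P_f @ A.T + Q
    return ll

def responsibilities(log_liks, weights):
    """Posterior cluster probabilities; log_liks has shape (N series, K clusters)."""
    log_post = log_liks + np.log(weights)
    log_post = log_post - log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def bic(total_loglik, n_params, n_obs):
    """Standard BIC; lower is better. What counts as n_obs is a modeling choice."""
    return -2.0 * total_loglik + n_params * np.log(n_obs)

# Usage with two hypothetical one-dimensional components and random toy data.
rng = np.random.default_rng(0)
I = np.eye(1)
comps = [dict(A=0.95 * I, C=I, Q=0.1 * I, R=0.1 * I, mu0=np.zeros(1), P0=I),
         dict(A=0.20 * I, C=I, Q=0.1 * I, R=0.1 * I, mu0=np.zeros(1), P0=I)]
series = [rng.normal(size=(50, 1)) for _ in range(4)]
log_liks = np.array([[kalman_loglik(y, **c) for c in comps] for y in series])
resp = responsibilities(log_liks, np.array([0.5, 0.5]))
```

Each row of `resp` sums to one and gives the soft assignment of one series to the two components. Comparing `bic` across fits with different numbers of components is the usual way to pick the number of clusters.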
LG, Jun 29, 2024
Time Series Clustering with General State Space Models via Stochastic Variational Inference
Ryoichi Ishizuka, Takashi Imai, Kaoru Kawamoto
In this paper, we propose a novel method of model-based time series clustering with mixtures of general state space models (MSSMs). Each component of an MSSM corresponds to one cluster. An advantage of the proposed method is that it allows the use of a time series model suited to the data at hand. This not only improves clustering and prediction accuracy but also enhances the interpretability of the estimated parameters. The parameters of the MSSMs are estimated using stochastic variational inference, a subtype of variational inference. The proposed method estimates the latent variables of an arbitrary state space model by using neural networks with a normalizing flow as a variational estimator. The number of clusters can be estimated using the Bayesian information criterion. In addition, to prevent MSSMs from converging to a local optimum, we propose several optimization tricks, including an additional penalty term called entropy annealing. To the best of our knowledge, the proposed method is the first computationally feasible one for time series clustering based on general (possibly nonlinear, non-Gaussian) state space models. Experiments on simulated datasets show that the proposed method is effective for clustering, parameter estimation, and estimating the number of clusters.
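For orientation, the quantities mentioned in the abstract can be written in generic form. The notation below is an assumption made for illustration and is not taken from the paper: f and g denote the transition and observation densities, \theta_k and \pi_k the parameters and mixing weight of cluster k, q_\phi a variational distribution over the latent states, d_K the number of free parameters, and n the sample size. The entropy annealing penalty is not shown because the abstract does not give its form.

```latex
% General state space model for cluster k (x: latent state, y: observation)
\[
x_t \sim f(x_t \mid x_{t-1}, \theta_k), \qquad
y_t \sim g(y_t \mid x_t, \theta_k)
\]

% Mixture likelihood of a single series y_{1:T}
\[
p(y_{1:T}) = \sum_{k=1}^{K} \pi_k \, p(y_{1:T} \mid \theta_k)
\]

% Evidence lower bound maximized in variational inference
\[
\log p(y_{1:T} \mid \theta_k) \ge
\mathbb{E}_{q_\phi(x_{1:T})}\!\left[ \log p(y_{1:T}, x_{1:T} \mid \theta_k)
- \log q_\phi(x_{1:T}) \right]
\]

% Bayesian information criterion for a model with K clusters (lower is better)
\[
\mathrm{BIC}(K) = -2 \log \hat{L}_K + d_K \log n
\]
```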