Streaming Inference for Infinite Non-Stationary Clustering
This paper addresses the challenge, faced by intelligent agents, of learning from continuous, evolving data streams; it represents an incremental advance in clustering methods.
The paper tackles unsupervised, streaming, non-stationary clustering by introducing a novel algorithm that creates new clusters online in a probabilistic manner, and it demonstrates the algorithm on diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.
Learning from a continuous stream of non-stationary data in an unsupervised manner is arguably one of the most common and most challenging settings facing intelligent agents. Here, we attack learning under all three conditions (unsupervised, streaming, non-stationary) in the context of clustering, also known as mixture modeling. We introduce a novel clustering algorithm that endows mixture models with the ability to create new clusters online, as demanded by the data, in a probabilistic, time-varying, and principled manner. To achieve this, we first define a novel stochastic process called the Dynamical Chinese Restaurant Process (Dynamical CRP), which is a non-exchangeable distribution over partitions of a set; next, we show that the Dynamical CRP provides a non-stationary prior over cluster assignments and yields an efficient streaming variational inference algorithm. We conclude with experiments showing that the Dynamical CRP can be applied to diverse synthetic and real data with Gaussian and non-Gaussian likelihoods.
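As background for the abstract, here is a minimal sketch of the classic (exchangeable) Chinese Restaurant Process. It is not the Dynamical CRP, whose definition the abstract does not give. It only illustrates how a CRP-style prior lets new clusters appear as data arrive: each new point joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The names `sample_crp`, `alpha`, and `seed` are hypothetical and chosen for illustration.

```python
import random


def sample_crp(n_customers, alpha, seed=0):
    """Sample a partition from the classic (stationary, exchangeable) CRP.

    Assumption: this is the textbook CRP, shown only as background.
    It is NOT the Dynamical CRP described in the abstract.

    Returns (assignments, counts), where assignments[i] is the cluster
    index of point i and counts[k] is the size of cluster k.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    assignments = []
    counts = []  # counts[k] = number of points currently in cluster k
    for _ in range(n_customers):
        # Existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    assignments, counts = sample_crp(n_customers=20, alpha=1.0)
    print("assignments:", assignments)
    print("cluster sizes:", counts)
```

In the classic process the cluster weights are plain counts, so the order in which points arrive does not change the probability of a partition. The abstract describes the Dynamical CRP as non-exchangeable and time-varying, so its assignment weights presumably depend on when earlier points arrived; that dependence is not modeled in this sketch.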