Clustering piecewise stationary processes
This work addresses a fundamental limitation in non-parametric time-series clustering for researchers and practitioners dealing with data exhibiting long-range dependence and non-stationarity.
The paper tackles the problem of clustering time-series data generated by piecewise stationary ergodic processes, relaxing the stationarity assumption for the first time in this context, and proposes simple, computationally efficient algorithms that are proven consistent without any assumptions beyond piecewise stationarity.
The problem of time-series clustering is considered in the case where each data point is a sample generated by a piecewise stationary ergodic process. Stationary processes are perhaps the most general class of processes considered in non-parametric statistics and allow for arbitrary long-range dependence between variables. Piecewise stationary processes, studied here for the first time in the context of clustering, relax the last remaining assumption in this model: stationarity. A natural formulation of the problem is proposed, together with a notion of consistency that requires samples to be placed in the same cluster if and only if the piecewise stationary distributions that generate them have the same set of stationary distributions. Simple, computationally efficient algorithms are proposed and shown to be consistent without any assumptions beyond piecewise stationarity.
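
The consistency notion above can be illustrated with a minimal Python sketch of the target clustering. It is not one of the proposed algorithms. It assumes, purely for illustration, that each sample comes with hypothetical labels ("A", "B", "C") naming the stationary distribution of each of its segments. In the actual problem these labels are unobserved and the grouping has to be inferred from the data alone. The function name `target_clustering` is likewise an assumption.

```python
from collections import defaultdict


def target_clustering(segment_labels):
    """Group sample indices by the set of stationary distributions they contain.

    segment_labels[i] lists the (hypothetical) distribution labels of the
    stationary segments of sample i, in order of appearance.
    """
    groups = defaultdict(list)
    for index, labels in enumerate(segment_labels):
        # Only the set of distributions matters: segment order, segment
        # count, and repetitions are ignored.
        groups[frozenset(labels)].append(index)
    return sorted(groups.values())


# Four samples; samples 0 and 1 share the same set {A, B}.
samples = [
    ["A", "B"],
    ["B", "A", "B"],
    ["A"],
    ["A", "C"],
]
print(target_clustering(samples))  # [[0, 1], [2], [3]]
```

Samples 0 and 1 fall in the same cluster even though their change points and segment orders differ, while sample 2 is separated because its set of stationary distributions is a strict subset. A consistent algorithm, in the sense described above, is one whose output matches this grouping.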