ML | Aug 5, 2019
Some Developments in Clustering Analysis on Stochastic Processes
Qidi Peng, Nan Rao, Ran Zhao
We review some developments in clustering stochastic processes and conclude that asymptotically consistent clustering algorithms can be obtained when the processes are ergodic and the dissimilarity measure satisfies the triangle inequality. Examples are provided for processes that are distribution ergodic, covariance ergodic, and locally asymptotically self-similar, respectively.
ML | Apr 13, 2018
Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
Qidi Peng, Nan Rao, Ran Zhao
We conduct cluster analysis on a class of locally asymptotically self-similar stochastic processes, which includes multifractional Brownian motion as a representative. Assuming the true number of clusters is known, we introduce a new covariance-based dissimilarity measure, from which we obtain approximately asymptotically consistent clustering algorithms. In simulation studies, clustering data sampled from multifractional Brownian motions with distinct functional Hurst parameters illustrates the approximate asymptotic consistency of the proposed algorithms. Clustering global financial markets' equity index returns and sovereign CDS spreads provides a successful real-world application.
ML | Jan 27, 2018
Covariance-based Dissimilarity Measures Applied to Clustering Wide-sense Stationary Ergodic Processes
Qidi Peng, Nan Rao, Ran Zhao
We introduce a new unsupervised learning problem: clustering wide-sense stationary ergodic stochastic processes. A covariance-based dissimilarity measure, together with asymptotically consistent algorithms, is designed for clustering offline and online datasets, respectively. We also suggest a formal criterion for the efficiency of dissimilarity measures, and discuss some approaches to improving the efficiency of our clustering algorithms when they are applied to particular types of processes, such as self-similar processes with wide-sense stationary ergodic increments. Clustering of synthetic and real-world data is provided as example applications.
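To give a rough sense of what a covariance-based dissimilarity between two sample paths can look like, here is a minimal Python sketch. It is not the measure defined in the paper: the names `empirical_autocov`, `cov_dissimilarity`, and `ar1_path`, the lag cutoff `max_lag`, the geometric weights `2**-(lag+1)`, and the AR(1) test processes are all assumptions chosen for illustration. The sketch compares empirical autocovariances lag by lag, so the result is symmetric and satisfies the triangle inequality by construction.

```python
import numpy as np


def empirical_autocov(x, max_lag):
    """Biased empirical autocovariances of one sample path for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    centered = x - x.mean()
    return np.array(
        [np.dot(centered[: n - lag], centered[lag:]) / n for lag in range(max_lag + 1)]
    )


def cov_dissimilarity(x, y, max_lag=5):
    """Weighted L1 distance between empirical autocovariance vectors.

    Illustrative choice only: the weights 2**-(lag+1) and the cutoff max_lag
    are assumptions, not taken from the paper.
    """
    weights = 2.0 ** -np.arange(1, max_lag + 2)
    diff = np.abs(empirical_autocov(x, max_lag) - empirical_autocov(y, max_lag))
    return float(np.sum(weights * diff))


def ar1_path(phi, n, rng):
    """Simulate an AR(1) path, a simple wide-sense stationary ergodic process."""
    noise = rng.standard_normal(n)
    path = np.empty(n)
    path[0] = noise[0]
    for t in range(1, n):
        path[t] = phi * path[t - 1] + noise[t]
    return path


rng = np.random.default_rng(0)
a1 = ar1_path(0.5, 5000, rng)   # two paths from the same process
a2 = ar1_path(0.5, 5000, rng)
b = ar1_path(-0.5, 5000, rng)   # one path from a different process

d_same = cov_dissimilarity(a1, a2)
d_diff = cov_dissimilarity(a1, b)
print(f"same process: {d_same:.4f}, different processes: {d_diff:.4f}")
```

With long enough paths, two realizations of the same process should end up closer under this distance than realizations of processes with different covariance structures, which is the property a clustering algorithm built on such a measure relies on.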