Variable Selection and Regularization via Arbitrary Rectangle-range Generalized Elastic Net
Yujia Ding, Qidi Peng, Zhengming Song et al.
We introduce the arbitrary rectangle-range generalized elastic net penalty method, abbreviated ARGEN, for performing constrained variable selection and regularization in high-dimensional sparse linear models. As a natural extension of the nonnegative elastic net penalty method, ARGEN is proved to have variable selection consistency and estimation consistency under some conditions. The asymptotic behavior in distribution of the ARGEN estimators is studied. We also propose an algorithm called MU-QP-RR-W-$l_1$ to solve ARGEN efficiently. Through a simulation study, we show that ARGEN outperforms the elastic net in a number of settings. Finally, ARGEN is applied to S&P 500 index tracking with constraints on the stock allocations, providing general guidance for adapting it to real-world problems.
ML, Aug 5, 2019
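To make the idea of an elastic net with rectangle (box) constraints concrete, here is a minimal Python sketch. It is not the MU-QP-RR-W-$l_1$ algorithm and not the ARGEN objective from the paper: it assumes a plain elastic net objective, (1/2n)||y - Xb||^2 + lam1*||b||_1 + (lam2/2)*||b||^2, with each coefficient restricted to an interval [lower, upper], and it swaps in a standard proximal gradient method. The function name `box_elastic_net` and all parameter names are hypothetical.

```python
import numpy as np

def box_elastic_net(X, y, lam1, lam2, lower, upper, n_iter=2000):
    """Proximal gradient sketch for a box-constrained elastic net.

    Assumed objective (an illustration, not the paper's formulation):
        (1/2n)||y - X b||^2 + lam1 * ||b||_1 + (lam2/2) * ||b||^2
        subject to lower <= b <= upper (elementwise).
    """
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth part.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n + lam2
    step = 1.0 / lipschitz
    # Start from the feasible point closest to zero.
    beta = np.clip(np.zeros(p), lower, upper)
    for _ in range(n_iter):
        # Gradient step on the least-squares and ridge terms.
        grad = X.T @ (X @ beta - y) / n + lam2 * beta
        v = beta - step * grad
        # Soft-thresholding handles the l1 term ...
        v = np.sign(v) * np.maximum(np.abs(v) - step * lam1, 0.0)
        # ... and clipping enforces the box; coordinate-wise, this
        # composition is the exact proximal step for l1 plus interval.
        beta = np.clip(v, lower, upper)
    return beta
```

Setting `lower=0` and `upper=np.inf` reduces this sketch to a nonnegative elastic net, while finite bounds of either sign give a general rectangle range.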
Some Developments in Clustering Analysis on Stochastic Processes
Qidi Peng, Nan Rao, Ran Zhao
We review some developments in clustering stochastic processes and conclude that asymptotically consistent clustering algorithms can be obtained when the processes are ergodic and the dissimilarity measure satisfies the triangle inequality. Examples are provided for processes that are distribution ergodic, covariance ergodic, and locally asymptotically self-similar, respectively.
Cluster Analysis on Locally Asymptotically Self-similar Processes with Known Number of Clusters
Qidi Peng, Nan Rao, Ran Zhao
We conduct cluster analysis on a class of locally asymptotically self-similar stochastic processes, which includes multifractional Brownian motion as a representative. When the true number of clusters is assumed known, a new covariance-based dissimilarity measure is introduced, from which we obtain approximately asymptotically consistent clustering algorithms. In simulation studies, clustering data sampled from multifractional Brownian motions with distinct functional Hurst parameters illustrates the approximate asymptotic consistency of the proposed algorithms. Clustering the equity index returns and sovereign CDS spreads of global financial markets provides a successful real-world application.
Covariance-based Dissimilarity Measures Applied to Clustering Wide-sense Stationary Ergodic Processes
Qidi Peng, Nan Rao, Ran Zhao
We introduce a new unsupervised learning problem: clustering wide-sense stationary ergodic stochastic processes. A covariance-based dissimilarity measure, together with asymptotically consistent algorithms, is designed for clustering offline and online datasets, respectively. We also suggest a formal criterion for the efficiency of dissimilarity measures and discuss some approaches to improving the efficiency of our clustering algorithms when they are applied to particular types of processes, such as self-similar processes with wide-sense stationary ergodic increments. Examples of clustering synthetic and real-world data are provided.
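As a rough illustration of what a covariance-based dissimilarity between two observed sample paths could look like, the Python sketch below compares sample autocovariances up to a fixed lag using a weighted sum of absolute differences. This is an assumption made for illustration, not the measure defined in the paper: the function names, the geometric weights, and the `max_lag` parameter are all hypothetical. Because it is a weighted l1 distance between autocovariance vectors, it satisfies the triangle inequality.

```python
import numpy as np

def sample_autocov(x, max_lag):
    """Biased sample autocovariances of x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    return np.array([xc[: n - k] @ xc[k:] / n for k in range(max_lag + 1)])

def cov_dissimilarity(x, y, max_lag=5):
    """Weighted l1 distance between sample autocovariance vectors.

    The weights 2^-(k+1) for lag k are an illustrative choice
    (hypothetical), down-weighting higher lags.
    """
    weights = 2.0 ** -np.arange(1, max_lag + 2)
    diff = sample_autocov(x, max_lag) - sample_autocov(y, max_lag)
    return float(np.sum(weights * np.abs(diff)))
```

A matrix of such pairwise values could then be passed to any distance-based clustering routine; which routine gives consistency guarantees is a question for the paper itself, not something this sketch establishes.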