Simultaneous Best Subset Selection and Dimension Reduction via Primal-Dual Iterations
This work addresses a fundamental gap between computation and statistical theory in high-dimensional regression, though it appears incremental relative to existing sparse reduced rank regression methods.
The paper tackles the discrepancy between theoretical guarantees and numerical computation in sparse reduced rank regression by developing a new algorithm that achieves an almost optimal estimation rate within a polynomial number of iterations, and it demonstrates the algorithm's effectiveness on ovarian cancer genetic data.
Sparse reduced rank regression is an essential statistical learning method. In the contemporary literature, estimation is typically formulated as a nonconvex optimization problem that often yields a local optimum in numerical computation. Yet the accompanying theoretical analysis is always centered on the global optimum, resulting in a discrepancy between the statistical guarantee and the numerical computation. In this research, we offer a new algorithm to address the problem and establish an almost optimal rate for the algorithmic solution. We also demonstrate that the algorithm achieves the estimation with a polynomial number of iterations. In addition, we present a generalized information criterion to simultaneously ensure the consistency of support set recovery and rank estimation. Under the proposed criterion, we show that our algorithm can achieve the oracle reduced rank estimation with a significant probability. Numerical studies and an application to ovarian cancer genetic data demonstrate the effectiveness and scalability of our approach.
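To make the target of "oracle reduced rank estimation" concrete, the following is a minimal sketch, assuming the oracle is the classical reduced rank least squares fit computed as if the true support set and rank were known. It is not the primal-dual algorithm proposed in the paper, and the paper's exact definition may differ. The function name `oracle_reduced_rank`, the simulated data, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


def oracle_reduced_rank(X, Y, support, rank):
    """Reduced rank least squares restricted to a known row support.

    Illustrative assumption: the support (active predictors) and the rank
    are given, which is what an "oracle" would know in advance.
    """
    p, q = X.shape[1], Y.shape[1]
    X_active = X[:, support]

    # Ordinary least squares using only the active predictors.
    C_ols, *_ = np.linalg.lstsq(X_active, Y, rcond=None)

    # Project onto the top right singular vectors of the fitted values,
    # which gives the rank-constrained least squares solution.
    _, _, Vt = np.linalg.svd(X_active @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T

    # Embed the result into a p x q matrix whose inactive rows are zero.
    C = np.zeros((p, q))
    C[support] = C_ols @ V_r @ V_r.T
    return C


# Simulated example: 5 active predictors out of 30, true rank 2.
rng = np.random.default_rng(0)
n, p, q, s, r = 200, 30, 8, 5, 2
X = rng.standard_normal((n, p))
C_true = np.zeros((p, q))
C_true[:s] = rng.standard_normal((s, r)) @ rng.standard_normal((r, q))
Y = X @ C_true + 0.1 * rng.standard_normal((n, q))

C_hat = oracle_reduced_rank(X, Y, np.arange(s), r)
print("rank of estimate:", np.linalg.matrix_rank(C_hat))
print("nonzero rows:", int(np.count_nonzero(np.abs(C_hat).sum(axis=1))))
print("Frobenius error:", np.linalg.norm(C_hat - C_true))
```

In practice neither the support nor the rank is known. Choosing them jointly is the hard, nonconvex part of the problem, and it is the part that the abstract's algorithm and generalized information criterion are meant to handle.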