MLOHCOMEApr 22, 2016

Non-convex Global Minimization and False Discovery Rate Control for the TREX

arXiv:1604.06815v222 citations
Originality Highly original
AI Analysis

This provides a rare example of globally optimizing a non-convex problem and advances statistical inference for high-dimensional data, though it is incremental in extending existing frameworks like the knockoff filter.

The paper tackles the computational challenge of the TREX, a non-convex optimization problem for sparse high-dimensional regression, by developing a polynomial-time algorithm guaranteed to find the global minimum, and uses this to enable a variable selection scheme that controls false discovery rate.

The TREX is a recently introduced method for performing sparse high-dimensional regression. Despite its statistical promise as an alternative to the lasso, square-root lasso, and scaled lasso, the TREX is computationally challenging in that it requires solving a non-convex optimization problem. This paper shows a remarkable result: despite the non-convexity of the TREX problem, there exists a polynomial-time algorithm that is guaranteed to find the global minimum. This result adds the TREX to a very short list of non-convex optimization problems that can be globally optimized (principal components analysis being a famous example). After deriving and developing this new approach, we demonstrate that (i) the ability of the preexisting TREX heuristic to reach the global minimum is strongly dependent on the difficulty of the underlying statistical problem, (ii) the new polynomial-time algorithm for TREX permits a novel variable ranking and selection scheme, (iii) this scheme can be incorporated into a rule that controls the false discovery rate (FDR) of included features in the model. To achieve this last aim, we provide an extension of the results of Barber & Candes (2015) to establish that the knockoff filter framework can be applied to the TREX. This investigation thus provides both a rare case study of a heuristic for non-convex optimization and a novel way of exploiting non-convexity for statistical inference.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes