ML IT LGDec 2, 2016

Restricted Strong Convexity Implies Weak Submodularity

Ethan R. Elenberg, Rajiv Khanna, Alexandros G. Dimakis, Sahand Negahban

arXiv:1612.00804v227.3163 citations

Originality Incremental advance

AI Analysis

This work provides theoretical guarantees for greedy feature selection methods in machine learning, applicable to various statistical learning problems with combinatorial structure.

The paper connects high-dimensional subset selection to submodular maximization, extending prior work from linear regression to arbitrary objective functions, and shows that greedy algorithms achieve constant-factor approximations to the optimal subset-selection solution for a broad class of functions.

We connect high-dimensional subset selection and submodular maximization. Our results extend the work of Das and Kempe (2011) from the setting of linear regression to arbitrary objective functions. For greedy feature selection, this connection allows us to obtain strong multiplicative performance bounds on several methods without statistical modeling assumptions. We also derive recovery guarantees of this form under standard assumptions. Our work shows that greedy algorithms perform within a constant factor from the best possible subset-selection solution for a broad class of general objective functions. Our methods allow a direct control over the number of obtained features as opposed to regularization parameters that only implicitly control sparsity. Our proof technique uses the concept of weak submodularity initially defined by Das and Kempe. We draw a connection between convex analysis and submodular set function theory which may be of independent interest for other statistical learning applications that have combinatorial structure.

View on arXiv PDF

Similar