Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach
This addresses the challenge of scalability and dimensionality in sparse learning for applications like classification, but it appears incremental as it builds on existing methods like random projection and regularization.
The paper tackles the problem of learning sparse models from large-scale, high-dimensional data by developing a randomized convex-concave optimization approach that combines random projection with ℓ1-norm regularization, and theoretical analysis shows it can accurately recover optimal primal and dual solutions under favored conditions.
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists a (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or approximately sparse. The fact that both primal and dual solutions are sparse motivates us to develop a randomized approach for a general convex-concave optimization problem. Specifically, the proposed approach combines the strength of random projection with that of sparse learning: it utilizes random projection to reduce the dimensionality, and introduces $\ell_1$-norm regularization to alleviate the approximation error caused by random projection. Theoretical analysis shows that under favored conditions, the randomized algorithm can accurately recover the optimal solutions to the convex-concave optimization problem (i.e., recover both the primal and dual solutions).