LG ML | Apr 15, 2015

Theory of Dual-sparse Regularized Randomized Reduction

Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu

arXiv:1504.03991v46.712 citations | h-index: 48

Originality: Incremental advance

AI Analysis

This work extends randomized reduction methods to broader domains by relaxing the strong data assumptions that earlier analyses required. The contribution appears incremental: it builds on existing reduction techniques and adds a new regularization approach.

The paper addresses the limitations of randomized reduction methods for high-dimensional classification by proposing dual-sparse regularized variants. Under a mild sparsity condition on the original dual solution, it shows that the reduced dual solution approximates the original one and concentrates on its support set. The analysis is supported by empirical validation and by an application to reducing communication costs in distributed learning.

In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification. Previous theoretical results on randomized reduction methods hinge on strong assumptions about the data, e.g., low rank of the data matrix or a large separable margin of classification, which hinder their applications in broad domains. To address these limitations, we propose dual-sparse regularized randomized reduction methods that introduce a sparse regularizer into the reduced dual problem. Under a mild condition that the original dual solution is a (nearly) sparse vector, we show that the resulting dual solution is close to the original dual solution and concentrates on its support set. In numerical experiments, we present an empirical study to support the analysis and we also present a novel application of the dual-sparse regularized randomized reduction methods to reducing the communication cost of distributed learning from large-scale high-dimensional data.
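To make the idea in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hinge-loss SVM, a Gaussian random projection, an l1 penalty as the sparse regularizer, and a plain proximal-gradient solver. The abstract fixes none of these choices, and the function names and the parameters `lam` and `tau` are hypothetical. With the dual variables constrained to [0, 1], the l1 penalty reduces to a constant shift before clipping, which pushes small dual variables to exactly zero.

```python
import numpy as np


def random_projection(X, m, rng):
    """Map n x d data to n x m using a Gaussian matrix scaled by 1/sqrt(m)."""
    d = X.shape[1]
    A = rng.standard_normal((d, m)) / np.sqrt(m)
    return X @ A


def dual_objective(beta, Z, lam, tau):
    """Negated hinge-loss SVM dual on reduced data plus an l1 penalty (assumed form)."""
    n = Z.shape[0]
    quad = np.sum((Z.T @ beta) ** 2) / (2.0 * lam * n ** 2)
    return quad - beta.sum() / n + tau * np.abs(beta).sum() / n


def solve_reduced_dual(X_hat, y, lam=0.1, tau=0.1, n_iter=500):
    """Proximal gradient on the l1-regularized reduced dual, with beta in [0, 1]^n."""
    n = X_hat.shape[0]
    Z = y[:, None] * X_hat                      # rows are y_i * x_hat_i
    # Lipschitz constant of the gradient of the quadratic term.
    L = np.linalg.norm(Z, 2) ** 2 / (lam * n ** 2)
    step = 1.0 / L
    beta = np.zeros(n)
    for _ in range(n_iter):
        grad = Z @ (Z.T @ beta) / (lam * n ** 2) - 1.0 / n
        # Prox of (tau/n)*||.||_1 plus the box constraint: shift, then clip.
        beta = np.clip(beta - step * (grad + tau / n), 0.0, 1.0)
    return beta, Z


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 1000))        # synthetic high-dimensional data
    y = np.where(X[:, 0] >= 0, 1.0, -1.0)
    X_hat = random_projection(X, m=50, rng=rng)
    beta, Z = solve_reduced_dual(X_hat, y, lam=0.1, tau=0.1)
    print("nonzero dual variables:", int(np.count_nonzero(beta)), "of", beta.size)
    print("objective:", dual_objective(beta, Z, 0.1, 0.1))
```

In this sketch, larger values of `tau` penalize nonzero dual variables more heavily. Once `tau` reaches 1 the linear term no longer rewards any nonzero entry, so the minimizer is the zero vector.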
