LG DC ML

Feb 10, 2018

Feature-Distributed SVRG for High-Dimensional Linear Classification

Gong-Duo Zhang, Shen-Yi Zhao, Hao Gao, Wu-Jun Li

arXiv:1802.03604v17.118 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of efficient distributed learning for high-dimensional applications such as text classification. It is an incremental improvement over existing methods.

The paper tackles the problem of high-dimensional linear classification by proposing a feature-distributed method called FD-SVRG, which reduces communication cost and wall-clock time compared to instance-distributed methods when data dimensionality exceeds the number of instances.

Linear classification has been widely used in many high-dimensional applications like text classification. To perform linear classification for large-scale tasks, we often need to design distributed learning methods on a cluster of multiple machines. In this paper, we propose a new distributed learning method, called feature-distributed stochastic variance reduced gradient (FD-SVRG) for high-dimensional linear classification. Unlike most existing distributed learning methods which are instance-distributed, FD-SVRG is feature-distributed. FD-SVRG has lower communication cost than other instance-distributed methods when the data dimensionality is larger than the number of data instances. Experimental results on real data demonstrate that FD-SVRG can outperform other state-of-the-art distributed methods for high-dimensional linear classification in terms of both communication cost and wall-clock time, when the dimensionality is larger than the number of instances in training data.
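To make the feature-distributed idea concrete, here is a minimal sketch of how a linear model's score w·x can be computed when the features, rather than the instances, are split across workers. It is not the FD-SVRG algorithm from the paper. It only illustrates the partitioning the abstract describes, simulated in a single process. The names `split_features`, `partial_dot`, and `distributed_margin` are hypothetical, and the contiguous, near-equal split is an assumption.

```python
import math
import random


def split_features(d, num_workers):
    """Return contiguous (start, end) feature ranges, one per worker (assumed layout)."""
    base, extra = divmod(d, num_workers)
    ranges, start = [], 0
    for k in range(num_workers):
        end = start + base + (1 if k < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


def partial_dot(w_block, x_block):
    """Inner product restricted to one worker's block of features."""
    return sum(wj * xj for wj, xj in zip(w_block, x_block))


def distributed_margin(w, x, ranges):
    """Sum the per-worker partial inner products to recover w . x.

    Each simulated worker contributes a single scalar for this instance,
    so the number of values exchanged equals the number of workers.
    """
    partials = [partial_dot(w[s:e], x[s:e]) for s, e in ranges]
    return sum(partials), len(partials)


random.seed(0)
d, num_workers = 1000, 4
w = [random.gauss(0, 1) for _ in range(d)]  # model parameters
x = [random.gauss(0, 1) for _ in range(d)]  # one training instance

ranges = split_features(d, num_workers)
margin, scalars_sent = distributed_margin(w, x, ranges)
full = partial_dot(w, x)  # single-machine reference value
```

In this toy setting each worker sends one scalar per instance instead of a d-dimensional vector, which gives some intuition for why a feature-distributed scheme can communicate less when the dimensionality is larger than the number of instances.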
