A core-set approach for distributed quadratic programming in big-data classification
For machine learning in cyber-physical networks, this provides a scalable distributed solution to high-dimensional classification problems, though the novelty is incremental.
The paper tackles distributed quadratic programming for big-data classification, proposing an asynchronous algorithm that scales with both sample size and dimension by using core-sets to approximate solutions. The method achieves consensus on active constraints, enabling efficient distributed optimization.
A new challenge for learning algorithms in cyber-physical network systems is the distributed solution of big-data classification problems, i.e., problems in which both the number of training samples and their dimension is high. Motivated by several problem set-ups in Machine Learning, in this paper we consider a special class of quadratic optimization problems involving a "large" number of input data, whose dimension is "big". To solve these quadratic optimization problems over peer-to-peer networks, we propose an asynchronous, distributed algorithm that scales with both the number and the dimension of the input data (training samples in the classification problem). The proposed distributed optimization algorithm relies on the notion of "core-set" which is used in geometric optimization to approximate the value function associated to a given set of points with a smaller subset of points. By computing local core-sets on a smaller version of the global problem and exchanging them with neighbors, the nodes reach consensus on a set of active constraints representing an approximate solution for the global quadratic program.