A Distributed Algorithm for Training Nonlinear Kernel Machines
This work addresses the challenge of scaling kernel methods to big data applications, though it appears incremental because it builds on existing Nyström and gradient-based approaches.
The paper tackles distributed training of nonlinear kernel machines on Map-Reduce by proposing a re-formulation of the Nyström approximation that is solved with gradient-based techniques, and it demonstrates the method's value on large benchmark datasets.
This paper concerns the distributed training of nonlinear kernel machines on Map-Reduce. We show that a re-formulation of the Nyström-approximation-based solution, solved using gradient-based techniques, is well suited to this setting, especially when a large number of basis points is needed. The main advantages of this approach are: avoidance of computing the pseudo-inverse of the kernel sub-matrix corresponding to the basis points; simplicity and efficiency of the distributed part of the computations; and friendliness to stage-wise addition of basis points. We implement the method using an AllReduce tree on Hadoop and demonstrate its value on a few large benchmark datasets.
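To make the idea concrete, the following is a minimal single-process sketch of how such a gradient-based solve might be organized. It is not the paper's implementation. It assumes an objective of the form (lam/2) * beta^T W beta + sum_i loss(C_i beta, y_i), where W is the kernel sub-matrix over the basis points and C holds the kernel values between training points and basis points. The squared loss, the RBF kernel, plain gradient descent, and all function names (`rbf_kernel`, `local_objective_and_gradient`, `allreduce_sum`, `train`) are illustrative assumptions. The AllReduce step is simulated by summing over in-memory shards.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """RBF kernel matrix between rows of A and rows of B (assumed kernel)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def local_objective_and_gradient(C_shard, y_shard, beta):
    """Work done on one node: loss and gradient over its own rows only."""
    residual = C_shard @ beta - y_shard  # squared loss is an assumption
    return 0.5 * residual @ residual, C_shard.T @ residual


def allreduce_sum(parts):
    """Stand-in for an AllReduce sum; a real cluster would sum across nodes."""
    return sum(parts)


def train(X_shards, y_shards, basis, lam=1e-2, iters=200):
    """Gradient descent on the assumed objective; W is never inverted."""
    W = rbf_kernel(basis, basis)
    # Each node computes kernel values between its rows and the basis points.
    C_shards = [rbf_kernel(X, basis) for X in X_shards]

    # Fixed step size 1/L from the largest Hessian eigenvalue. Forming this
    # m-by-m matrix is a shortcut for the sketch, not something a large-scale
    # solver would necessarily do.
    CtC = allreduce_sum([C.T @ C for C in C_shards])
    step = 1.0 / np.linalg.eigvalsh(lam * W + CtC)[-1]

    beta = np.zeros(len(basis))
    history = []
    for _ in range(iters):
        results = [
            local_objective_and_gradient(C, y, beta)
            for C, y in zip(C_shards, y_shards)
        ]
        loss = allreduce_sum([r[0] for r in results])
        grad = allreduce_sum([r[1] for r in results])
        # The regularizer involves only the small basis matrix W.
        history.append(loss + 0.5 * lam * beta @ W @ beta)
        beta = beta - step * (grad + lam * W @ beta)
    return beta, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    y = np.sign(X[:, 0])
    basis = X[:20]  # basis points drawn from the data (assumption)
    beta, history = train(np.array_split(X, 4), np.array_split(y, 4), basis)
    print("objective: first", history[0], "last", history[-1])
```

In this sketch each iteration exchanges only a scalar loss and a gradient vector whose length equals the number of basis points, and W enters only through matrix-vector products, so no pseudo-inverse is formed. Adding basis points stage-wise would amount to appending columns to each node's C shard and extending beta, which is why a formulation like this one fits that pattern.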