Mercer Large-Scale Kernel Machines from Ridge Function Perspective
This work provides theoretical insights into kernel approximations for machine learning, but it appears incremental as it builds on existing approximation theory results.
The paper analyzes large-scale kernel machines from a ridge function perspective, identifying obstacles in approximating kernels with sums of cosine products and applying the results to image processing using a one-vs-rest procedure.
To present Mercer large-scale kernel machines from a ridge function perspective, we recall the results of Lin and Pinkus from {\it Fundamentality of ridge functions}. We consider the main result of the recent paper by Rahimi and Recht (2008), {\it Random features for large-scale kernel machines}, from the point of view of approximation theory. We study which kernels can be approximated by a sum of products of cosine functions with arguments depending on $x$ and $y$, and present the obstacles to such an approach. The results of this article are applied to image processing via a one-vs-rest procedure.
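As an illustration of what a "sum of products of cosine functions" looks like in the random features setting, the following minimal sketch approximates a Gaussian kernel $k(x,y)=\exp(-\gamma\|x-y\|^2)$ by $\frac{2}{D}\sum_{j=1}^{D}\cos(w_j\cdot x+b_j)\cos(w_j\cdot y+b_j)$, with $w_j$ drawn from $N(0, 2\gamma I)$ and $b_j$ uniform on $[0, 2\pi]$. The choice of the Gaussian kernel, the parameter names `gamma` and `n_terms`, and the function names are assumptions made for this example; they are not taken from the paper.

```python
import math
import random


def gaussian_kernel(x, y, gamma):
    """Exact Gaussian kernel exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def sample_frequencies(dim, n_terms, gamma, rng):
    """Draw (w_j, b_j) pairs: w_j ~ N(0, 2*gamma*I), b_j ~ Uniform[0, 2*pi]."""
    sigma = math.sqrt(2.0 * gamma)
    return [
        ([rng.gauss(0.0, sigma) for _ in range(dim)],
         rng.uniform(0.0, 2.0 * math.pi))
        for _ in range(n_terms)
    ]


def cosine_product_kernel(x, y, params):
    """Approximate k(x, y) by (2/D) * sum_j cos(w_j.x + b_j) * cos(w_j.y + b_j)."""
    total = 0.0
    for w, b in params:
        # Each term is a product of two ridge functions: one in x, one in y.
        wx = sum(wi * xi for wi, xi in zip(w, x))
        wy = sum(wi * yi for wi, yi in zip(w, y))
        total += math.cos(wx + b) * math.cos(wy + b)
    return 2.0 * total / len(params)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    params = sample_frequencies(dim=3, n_terms=20000, gamma=0.5, rng=rng)
    x, y = [0.3, -1.0, 0.5], [1.0, 0.2, -0.4]
    print("exact :", gaussian_kernel(x, y, 0.5))
    print("approx:", cosine_product_kernel(x, y, params))
```

Each summand is a product of a ridge function of $x$ and a ridge function of $y$, which is the structure examined in the abstract. The approximation error of this Monte Carlo sum decreases on the order of $1/\sqrt{D}$, so the agreement with the exact kernel is only approximate for any finite number of terms.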