Sparse Multiple Kernel Learning with Geometric Convergence Rate
This work addresses efficient kernel selection for classification, but it appears incremental as it builds on existing MKL methods with a focus on improved convergence.
The paper tackles the problem of sparse multiple kernel learning by developing a greedy coordinate descent algorithm that achieves a geometric convergence rate under certain conditions, with a generalization error bound derived using local Rademacher complexity.
In this paper, we study the problem of sparse multiple kernel learning (MKL), where the goal is to efficiently learn a combination of a fixed small number of kernels from a large pool that could lead to a kernel classifier with a small prediction error. We develop an efficient algorithm based on the greedy coordinate descent algorithm, that is able to achieve a geometric convergence rate under appropriate conditions. The convergence rate is achieved by measuring the size of functional gradients by an empirical $\ell_2$ norm that depends on the empirical data distribution. This is in contrast to previous algorithms that use a functional norm to measure the size of gradients, which is independent from the data samples. We also establish a generalization error bound of the learned sparse kernel classifier using the technique of local Rademacher complexity.