Deep Multiple Kernel Learning
This work addresses the poor generalization of deep learning in data-scarce settings. It is an incremental improvement that adapts kernel methods to deep architectures.
The paper tackles the poor generalization of deep neural networks on small datasets by proposing deep multiple kernel learning. The method learns multiple layers of kernels and optimizes over an estimate of the SVM leave-one-out error rather than the dual objective function. Across a variety of datasets, each added layer improves performance while using only a few base kernels.
Deep learning methods have predominantly been applied to large artificial neural networks. Despite their state-of-the-art performance, these large networks typically do not generalize well to datasets with limited sample sizes. In this paper, we take a different approach by learning multiple layers of kernels. We combine kernels at each layer and then optimize over an estimate of the support vector machine leave-one-out error rather than the dual objective function. Our experiments on a variety of datasets show that each layer successively increases performance with only a few base kernels.
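The idea of combining kernels at each layer can be illustrated with a minimal sketch. It assumes three base kernels (linear, polynomial, RBF) and hand-fixed non-negative weights; the function names, hyperparameters, and toy data are hypothetical and not taken from the paper. The optimization over the leave-one-out error estimate is omitted, so this shows only how a layered kernel might be built, not how its weights are learned.

```python
import math


def linear(x, y):
    """Plain dot product, used as the input-level kernel."""
    return sum(a * b for a, b in zip(x, y))


def gram(points, kernel):
    """Gram matrix of `kernel` over `points`, as a list of lists."""
    return [[kernel(p, q) for q in points] for p in points]


def combine(grams, weights):
    """Weighted sum of Gram matrices; non-negative weights keep it a valid kernel."""
    n = len(grams[0])
    return [
        [sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
        for i in range(n)
    ]


def base_kernels_on_gram(K, gamma=0.5, degree=2, coef0=1.0):
    """Apply the base kernels in the feature space induced by K.

    Assumed hyperparameters (gamma, degree, coef0) are illustrative only.
    Squared distances in that space follow from K alone:
    ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y).
    """
    n = len(K)
    lin = [row[:] for row in K]
    poly = [[(K[i][j] + coef0) ** degree for j in range(n)] for i in range(n)]
    rbf = [
        [math.exp(-gamma * (K[i][i] - 2.0 * K[i][j] + K[j][j])) for j in range(n)]
        for i in range(n)
    ]
    return [lin, poly, rbf]


def deep_kernel(points, layer_weights):
    """Stack layers: each layer combines base kernels computed on the previous layer."""
    K = gram(points, linear)
    for weights in layer_weights:
        K = combine(base_kernels_on_gram(K), weights)
    return K


# Toy data and fixed weights, chosen by hand for illustration.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
K = deep_kernel(X, [[0.2, 0.3, 0.5], [0.1, 0.4, 0.5]])
```

The resulting matrix `K` could be passed to any SVM solver that accepts a precomputed kernel. In the approach described above, the per-layer weights would instead be chosen by optimizing an estimate of the leave-one-out error.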