Fast Bounded Online Gradient Descent Algorithms for Scalable Kernel-Based Online Learning
This work addresses the scalability of kernel-based online learning on large-scale datasets, though its contribution is incremental because it builds on existing bounded learning methods.
The authors tackle the scalability issue in kernel-based online learning, which stems from an unbounded number of support vectors. They propose two bounded online gradient descent algorithms, BOGD and BOGD++, that constrain the number of support vectors to a predefined budget, and they report promising empirical performance in both efficacy and efficiency on large-scale datasets.
Kernel-based online learning has often shown state-of-the-art performance on many online learning tasks. However, it suffers from a major shortcoming: the number of support vectors is unbounded, which makes it non-scalable and unsuitable for applications with large-scale datasets. In this work, we study the problem of bounded kernel-based online learning, which aims to constrain the number of support vectors to a predefined budget. Although several algorithms have been proposed in the literature, they are neither computationally efficient, due to their intensive budget maintenance strategies, nor effective, due to their use of the simple Perceptron algorithm. To overcome these limitations, we propose a framework for bounded kernel-based online learning based on an online gradient descent approach. We propose two efficient bounded online gradient descent (BOGD) algorithms for scalable kernel-based online learning: (i) BOGD, which maintains support vectors using uniform sampling, and (ii) BOGD++, which maintains support vectors using non-uniform sampling. We present a theoretical analysis of the regret bounds for both algorithms and find promising empirical performance in terms of both efficacy and efficiency when comparing them with several well-known algorithms for bounded kernel-based online learning on large-scale datasets.
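To make the idea of a support-vector budget concrete, the following is a minimal sketch of kernel online gradient descent with a fixed budget. It is not the authors' BOGD procedure: the abstract does not specify the loss, the kernel, the step size, or how coefficients are handled after a removal, so the hinge loss, the RBF kernel, the constant step size `eta`, the regularization weight `lam`, and the rule "drop one support vector chosen uniformly at random when the budget is full" are all assumptions made for illustration. The class name `BudgetedKernelOGD` is hypothetical.

```python
import math
import random


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


class BudgetedKernelOGD:
    """Hypothetical budgeted kernel learner; not the paper's exact BOGD."""

    def __init__(self, budget, eta=0.5, lam=0.01, gamma=1.0, seed=0):
        if budget < 1:
            raise ValueError("budget must be at least 1")
        self.budget = budget   # maximum number of support vectors
        self.eta = eta         # step size (assumed constant)
        self.lam = lam         # L2 regularization weight (assumed)
        self.gamma = gamma     # RBF kernel width (assumed)
        self.rng = random.Random(seed)
        self.support = []      # list of (support_vector, coefficient) pairs

    def score(self, x):
        """Evaluate f(x) = sum_i alpha_i * k(x_i, x)."""
        return sum(a * rbf_kernel(sv, x, self.gamma) for sv, a in self.support)

    def update(self, x, y):
        """Take one online step on example (x, y), with y in {-1, +1}."""
        margin = y * self.score(x)
        # The regularizer's gradient shrinks every existing coefficient.
        shrink = 1.0 - self.eta * self.lam
        self.support = [(sv, a * shrink) for sv, a in self.support]
        if margin < 1.0:  # hinge loss is active, so x becomes a support vector
            if len(self.support) >= self.budget:
                # Budget maintenance (simplifying assumption): discard one
                # support vector chosen uniformly at random.
                self.support.pop(self.rng.randrange(len(self.support)))
            self.support.append((x, self.eta * y))
        return margin
```

With this sketch, each prediction costs at most `budget` kernel evaluations no matter how many examples have been seen, which is the scalability property the abstract is after. A non-uniform variant in the spirit of BOGD++ would replace the `randrange` call with a weighted draw, but the sampling distribution is not given here, so it is left out.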