LGJul 1, 2024

Learnability in Online Kernel Selection with Memory Constraint via Data-dependent Regret Analysis

arXiv:2407.00916v3h-index: 3
Originality Incremental advance
AI Analysis

This work addresses the fundamental problem of online kernel selection for machine learning practitioners, providing theoretical insights into learnability under memory limits, though it is incremental as it builds on prior worst-case analyses with new data-dependent bounds.

The paper tackles online kernel selection under memory constraints by deriving data-dependent regret bounds based on kernel alignment and cumulative losses of competitive hypotheses, showing that learning is possible with small memory if these complexities are sub-linear, and includes empirical validation on benchmarks.

Online kernel selection is a fundamental problem of online kernel methods.In this paper,we study online kernel selection with memory constraint in which the memory of kernel selection and online prediction procedures is limited to a fixed budget. An essential question is what is the intrinsic relationship among online learnability, memory constraint, and data complexity? To answer the question,it is necessary to show the trade-offs between regret and memory constraint.Previous work gives a worst-case lower bound depending on the data size,and shows learning is impossible within a small memory constraint.In contrast, we present distinct results by offering data-dependent upper bounds that rely on two data complexities:kernel alignment and the cumulative losses of competitive hypothesis.We propose an algorithmic framework giving data-dependent upper bounds for two types of loss functions.For the hinge loss function,our algorithm achieves an expected upper bound depending on kernel alignment.For smooth loss functions,our algorithm achieves a high-probability upper bound depending on the cumulative losses of competitive hypothesis.We also prove a matching lower bound for smooth loss functions.Our results show that if the two data complexities are sub-linear,then learning is possible within a small memory constraint.Our algorithmic framework depends on a new buffer maintaining framework and a reduction from online kernel selection to prediction with expert advice. Finally,we empirically verify the prediction performance of our algorithms on benchmark datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes