MLITLGNov 10, 2022

Regret Bounds for Noise-Free Cascaded Kernelized Bandits

arXiv:2211.05430v23 citationsh-index: 31
Originality Incremental advance
AI Analysis

This work addresses the problem of efficient optimization in structured function networks for researchers in bandit theory and machine learning, offering incremental theoretical advances with specific regret bounds.

The paper tackles optimizing function networks in a noise-free grey-box setting with known structure, proposing algorithms like GPN-UCB and providing theoretical regret bounds that match or improve on known dependencies for parameters such as time horizon and network length.

We consider optimizing a function network in the noise-free grey-box setting with RKHS function classes, where the exact intermediate results are observable. We assume that the structure of the network is known (but not the underlying functions comprising it), and we study three types of structures: (1) chain: a cascade of scalar-valued functions, (2) multi-output chain: a cascade of vector-valued functions, and (3) feed-forward network: a fully connected feed-forward network of scalar-valued functions. We propose a sequential upper confidence bound based algorithm GPN-UCB along with a general theoretical upper bound on the cumulative regret. In addition, we propose a non-adaptive sampling based method along with its theoretical upper bound on the simple regret for the Matérn kernel. We also provide algorithm-independent lower bounds on the simple regret and cumulative regret. Our regret bounds for GPN-UCB have the same dependence on the time horizon as the best known in the vanilla black-box setting, as well as near-optimal dependencies on other parameters (e.g., RKHS norm and network length).

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes