LGJun 30, 2024

Adversarial Online Learning with Temporal Feedback Graphs

arXiv:2407.00571v12 citations
Originality Highly original
AI Analysis

This work addresses a fundamental challenge in online learning for scenarios with structured feedback delays, which is incremental but provides optimal solutions for important graph classes.

The paper tackles the problem of adversarial online learning with temporal feedback graphs, where the learner's actions depend only on losses from specific past rounds. It presents a novel algorithm that achieves optimal regret bounds for transitive feedback graphs, with tight lower bounds in many practical settings.

We study a variant of prediction with expert advice where the learner's action at round $t$ is only allowed to depend on losses on a specific subset of the rounds (where the structure of which rounds' losses are visible at time $t$ is provided by a directed "feedback graph" known to the learner). We present a novel learning algorithm for this setting based on a strategy of partitioning the losses across sub-cliques of this graph. We complement this with a lower bound that is tight in many practical settings, and which we conjecture to be within a constant factor of optimal. For the important class of transitive feedback graphs, we prove that this algorithm is efficiently implementable and obtains the optimal regret bound (up to a universal constant).

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes