DS AI LGMar 3, 2023

Near Optimal Memory-Regret Tradeoff for Online Learning

arXiv:2303.01673v211.715 citationsh-index: 29

Originality Incremental advance

AI Analysis

This addresses the challenge of efficient online learning under memory constraints, which is crucial for applications in resource-limited systems, though it builds incrementally on prior work.

The paper tackles the problem of online learning with limited memory, improving the memory-regret tradeoff against an oblivious adversary to nearly match known lower bounds, and establishing that roughly √n memory is necessary and sufficient for achieving sublinear regret against an adaptive adversary.

In the experts problem, on each of $T$ days, an agent needs to follow the advice of one of $n$ ``experts''. After each day, the loss associated with each expert's advice is revealed. A fundamental result in learning theory says that the agent can achieve vanishing regret, i.e. their cumulative loss is within $o(T)$ of the cumulative loss of the best-in-hindsight expert. Can the agent perform well without sufficient space to remember all the experts? We extend a nascent line of research on this question in two directions: $\bullet$ We give a new algorithm against the oblivious adversary, improving over the memory-regret tradeoff obtained by [PZ23], and nearly matching the lower bound of [SWXZ22]. $\bullet$ We also consider an adaptive adversary who can observe past experts chosen by the agent. In this setting we give both a new algorithm and a novel lower bound, proving that roughly $\sqrt{n}$ memory is both necessary and sufficient for obtaining $o(T)$ regret.

View on arXiv PDF

Similar