LG | AI | Oct 18, 2024

Online Reinforcement Learning with Passive Memory

arXiv:2410.14665v1 | h-index: 34 | ACC
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of making online reinforcement learning more efficient, though it appears incremental: it builds on existing methods by adding passive data.

The paper tackles online reinforcement learning by incorporating pre-collected data (passive memory) to improve performance. It achieves near-minimax optimal regret, shows that the sub-optimality of the regret depends on the quality of the memory, and applies to both continuous and discrete state-action spaces.

This paper considers an online reinforcement learning algorithm that leverages pre-collected data (passive memory) from the environment during online interaction. We show that using passive memory improves performance, and we provide theoretical guarantees on the regret, which turns out to be near-minimax optimal. Results show that the quality of the passive memory determines the sub-optimality of the incurred regret. The proposed approach and results hold in both continuous and discrete state-action spaces.
