LG · AI · ML · May 29, 2019

Advantage Amplification in Slowly Evolving Latent-State Environments

arXiv:1905.13559v1 · 9 citations
Originality: Incremental advance
AI Analysis

This work addresses a domain-specific problem for RL in slowly evolving latent-state environments, offering incremental improvements through novel aggregation methods.

The paper tackles reinforcement learning in latent-state environments with long horizons, such as recommender systems. It develops an advantage amplification principle based on temporal abstraction and demonstrates the resulting aggregation methods on a stylized user-modeling task.

Latent-state environments with long horizons, such as those faced by recommender systems, pose significant challenges for reinforcement learning (RL). In this work, we identify and analyze several key hurdles for RL in such environments, including belief state error and small action advantage. We develop a general principle of advantage amplification that can overcome these hurdles through the use of temporal abstraction. We propose several aggregation methods and prove they induce amplification in certain settings. We also bound the loss in optimality incurred by our methods in environments where latent state evolves slowly and demonstrate their performance empirically in a stylized user-modeling task.
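To make the idea of advantage amplification concrete, the following is a minimal sketch under strong simplifying assumptions, not the paper's own construction. It assumes a latent state that does not change at all (the limiting case of slow evolution), two actions with fixed per-step rewards, and one simple form of temporal abstraction: committing to an action for k steps before acting optimally. The function names and the toy reward values are hypothetical.

```python
def k_step_advantage(r_best, r_other, gamma, k):
    """Value gap between holding the best action and holding the other
    action for k steps, then acting optimally forever after.

    Assumes a fixed latent state, so per-step rewards never change.
    """
    # Value of always playing the best action (geometric series).
    v_opt = r_best / (1.0 - gamma)

    def q(r):
        # Discounted reward collected while the action is held for k steps,
        # plus the discounted optimal continuation value.
        held = sum(gamma**t * r for t in range(k))
        return held + gamma**k * v_opt

    return q(r_best) - q(r_other)


def amplification(gamma, k):
    """Factor by which k-step commitment scales the one-step advantage
    in this toy setting: 1 + gamma + ... + gamma^(k-1)."""
    return (1.0 - gamma**k) / (1.0 - gamma)


if __name__ == "__main__":
    gamma = 0.99
    for k in (1, 5, 10, 50):
        adv = k_step_advantage(1.0, 0.99, gamma, k)
        print(f"k={k:3d}  advantage={adv:.4f}  factor={amplification(gamma, k):.2f}")
```

In this toy case the one-step advantage equals the raw reward gap, which can be tiny relative to the total value and therefore easy to lose in estimation noise or belief-state error. Holding the action for k steps scales the gap by the geometric sum, while the cost of committing is zero only because the latent state is assumed fixed; when the state drifts, that commitment can lose optimality, which is the trade-off the abstract's bound addresses.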
