Learning What to Memorize: Using Intrinsic Motivation to Form Useful Memory in Partially Observable Reinforcement Learning
This work addresses the problem of state disambiguation in partially observable environments for reinforcement learning agents, offering a novel approach to memory management that could enhance learning efficiency in such settings.
The paper tackles the challenge of long-term dependencies in partially observable reinforcement learning by proposing an agent-controlled memory mechanism, where the agent learns to memorize rare observations using intrinsic motivation, achieving improved performance on several tasks compared to existing memory-based methods.
Reinforcement learning faces an important challenge in partially observable environments with long-term dependencies. To learn in an ambiguous environment, an agent has to keep previous perceptions in memory. Earlier memory-based approaches use a fixed method to determine what to keep in memory, which limits them to certain problems. In this study, we follow the idea of giving control of the memory to the agent by allowing it to take memory-changing actions. This learning mechanism is supported by an intrinsic motivation to memorize rare observations that can help the agent disambiguate its state in the environment. Our approach is evaluated and analyzed on several partially observable tasks with long-term dependencies and compared with other memory-based methods.
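The idea of memory-changing actions paired with a rarity-driven intrinsic reward can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the paper: the class `MemoryAugmentedState`, the binary store/skip memory action, the oldest-first eviction rule, and the count-based bonus 1/sqrt(count) are assumptions chosen only to make the mechanism concrete.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class MemoryAugmentedState:
    """Hypothetical wrapper: the agent acts on (current observation, memory)."""

    capacity: int = 1                      # assumed number of memory slots
    memory: tuple = ()                     # stored observations, oldest first
    counts: Counter = field(default_factory=Counter)  # visit counts per observation

    def observe(self, obs):
        """Count the observation and return the augmented state."""
        self.counts[obs] += 1
        return (obs, self.memory)

    def apply_memory_action(self, obs, store):
        """Apply the agent's memory action and return an intrinsic reward.

        store=True pushes obs into memory, evicting the oldest entry when full.
        store=False leaves the memory unchanged and yields no bonus.
        """
        if not store:
            return 0.0
        self.memory = (self.memory + (obs,))[-self.capacity:]
        # Assumed rarity bonus: rarely seen observations earn a larger reward.
        return 1.0 / sqrt(self.counts[obs])


# Usage: a frequently seen observation versus a rare one.
m = MemoryAugmentedState(capacity=1)
for obs in ["corridor", "corridor", "corridor", "signal"]:
    m.observe(obs)

common_bonus = m.apply_memory_action("corridor", store=True)
rare_bonus = m.apply_memory_action("signal", store=True)
```

In this sketch, storing the rare "signal" observation earns a larger bonus than storing the common "corridor" observation, so a learner that adds this bonus to the environment reward would be nudged toward keeping observations that are more likely to disambiguate otherwise identical states.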