Unraveling the Complexity of Memory in RL Agents: an Approach for Classification and Evaluation
This work addresses a foundational problem for RL researchers by enabling objective comparison of memory-enhanced agents, though it is incremental in refining existing concepts rather than introducing new paradigms.
The paper tackles the lack of unified definitions and evaluation methods for memory in reinforcement learning agents by providing precise classifications and a standardized experimental methodology, and it demonstrates the importance of following that methodology through empirical tests with various RL agents.
The incorporation of memory into agents is essential for numerous tasks within the domain of Reinforcement Learning (RL). In particular, memory is paramount for tasks that require the utilization of past information, adaptation to novel environments, and improved sample efficiency. However, the term ``memory'' encompasses a wide range of concepts, which, coupled with the lack of a unified methodology for validating an agent's memory, leads to erroneous judgments about agents' memory capabilities and prevents objective comparison with other memory-enhanced agents. This paper aims to streamline the concept of memory in RL by providing practical, precise definitions of agent memory types, such as long-term versus short-term memory and declarative versus procedural memory, inspired by cognitive science. Using these definitions, we categorize different classes of agent memory, propose a robust experimental methodology for evaluating the memory capabilities of RL agents, and standardize evaluations. Furthermore, through experiments with different RL agents, we empirically demonstrate the importance of adhering to the proposed methodology when evaluating different types of agent memory, and we show what its violation leads to.