Efficient Diversity-based Experience Replay for Deep Reinforcement Learning
This work addresses efficiency issues in reinforcement learning for real-world scenarios with high-dimensional state spaces, and represents an incremental improvement over existing replay methods.
The paper tackles the low efficiency of experience replay in reinforcement learning by proposing Efficient Diversity-based Experience Replay (EDER), which uses a determinantal point process to prioritize diverse samples. It reports significant improvements in learning efficiency and superior performance in high-dimensional environments such as MuJoCo, Atari, and Habitat.
Experience replay is widely used to improve learning efficiency in reinforcement learning by leveraging past experiences. However, existing experience replay methods, whether based on uniform or prioritized sampling, often suffer from low efficiency, particularly in real-world scenarios with high-dimensional state spaces. To address this limitation, we propose a novel approach, Efficient Diversity-based Experience Replay (EDER). EDER employs a determinantal point process to model the diversity between samples and prioritizes replay according to that diversity. To further enhance learning efficiency, we incorporate Cholesky decomposition to handle the large state spaces of realistic environments. Additionally, rejection sampling is applied to select samples with higher diversity, thereby improving overall learning efficiency. Extensive experiments are conducted on robotic manipulation tasks in MuJoCo, Atari games, and realistic indoor environments in Habitat. The results demonstrate that our approach not only significantly improves learning efficiency but also achieves superior performance in high-dimensional, realistic environments.
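The three ingredients named in the abstract (a determinantal point process score, Cholesky decomposition, and rejection sampling) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear (Gram) kernel over unit-normalized states, the jitter term, the uniform proposal, the fallback to the best candidate seen, and the function names `diversity_score` and `sample_diverse_batch` are all assumptions made for illustration.

```python
import numpy as np


def diversity_score(states, jitter=1e-6):
    """Log-determinant of a Gram kernel over the given states.

    Assumption: a linear kernel on unit-normalized state vectors stands in
    for whatever DPP kernel the paper actually uses.
    """
    X = np.asarray(states, dtype=np.float64)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.maximum(norms, 1e-12)
    # The jitter keeps the kernel positive definite so Cholesky succeeds.
    L = X @ X.T + jitter * np.eye(X.shape[0])
    chol = np.linalg.cholesky(L)
    # det(L) = prod(diag(chol))^2, so log det(L) = 2 * sum(log(diag(chol))).
    return 2.0 * np.sum(np.log(np.diag(chol)))


def sample_diverse_batch(buffer_states, batch_size, rng, max_tries=50, jitter=1e-6):
    """Rejection-sample a batch of buffer indices, favoring diverse batches.

    Candidates are proposed uniformly and accepted with probability
    det(L_S) / (1 + jitter)^k. By Hadamard's inequality the determinant of a
    positive semidefinite matrix is at most the product of its diagonal
    entries, so this ratio is a valid probability. If nothing is accepted
    within max_tries, the most diverse candidate seen is returned
    (a hypothetical fallback, not taken from the paper).
    """
    X = np.asarray(buffer_states, dtype=np.float64)
    log_bound = batch_size * np.log1p(jitter)
    best_idx, best_score = None, -np.inf
    for _ in range(max_tries):
        idx = rng.choice(X.shape[0], size=batch_size, replace=False)
        score = diversity_score(X[idx], jitter=jitter)
        if score > best_score:
            best_idx, best_score = idx, score
        # 1 - U lies in (0, 1], so the logarithm is always finite.
        if np.log(1.0 - rng.random()) < score - log_bound:
            return idx
    return best_idx
```

Under these assumptions, a batch of mutually orthogonal states scores close to zero in log-determinant, while a batch of near-duplicates scores strongly negative, so duplicates are rarely accepted. Calling `sample_diverse_batch(states, 8, np.random.default_rng(0))` on a buffer of state vectors returns eight distinct indices.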