LG · AI · Jan 22, 2025

Adaptive Data Exploitation in Deep Reinforcement Learning

arXiv:2501.12620v1 · 1 citation · h-index: 11 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses data inefficiency in deep reinforcement learning and offers a practical solution for researchers and practitioners. The contribution appears incremental, since it builds on existing RL methods by adding adaptive data management.

The paper tackles data efficiency and generalization in deep reinforcement learning by introducing ADEPT, a framework that adaptively manages data usage with multi-armed bandit algorithms, achieving superior performance and computational efficiency on benchmarks like Procgen, MiniGrid, and PyBullet.

We introduce ADEPT: Adaptive Data ExPloiTation, a simple yet powerful framework that enhances **data efficiency** and **generalization** in deep reinforcement learning (RL). Specifically, ADEPT adaptively manages the use of sampled data across different learning stages via multi-armed bandit (MAB) algorithms, optimizing data utilization while mitigating overfitting. Moreover, ADEPT can significantly reduce computational overhead and accelerate a wide range of RL algorithms. We test ADEPT on benchmarks including Procgen, MiniGrid, and PyBullet. Extensive simulations demonstrate that ADEPT achieves superior performance with remarkable computational efficiency, offering a practical solution to data-efficient RL. Our code is available at https://github.com/yuanmingqi/ADEPT.
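To make the bandit idea in the abstract concrete, here is a minimal sketch of a UCB1 bandit choosing how many times to reuse each batch of sampled data. It is not the authors' implementation. It assumes, for illustration only, that the arms are candidate numbers of update epochs per rollout and that a scalar reward in [0, 1] scores each choice. The names `UCB1`, `toy_reward`, and `run_demo`, the candidate values, and the reward table are all hypothetical; a real agent would derive the reward from training signals rather than a fixed table.

```python
import math
import random


class UCB1:
    """Standard UCB1 bandit over a fixed set of arms."""

    def __init__(self, arms, c=2.0):
        self.arms = list(arms)                # e.g. candidate update-epoch counts (assumed)
        self.c = c                            # exploration coefficient
        self.counts = [0] * len(self.arms)    # pulls per arm
        self.values = [0.0] * len(self.arms)  # running mean reward per arm
        self.t = 0                            # total pulls so far

    def select(self):
        # Play every arm once before applying the UCB rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        scores = [
            v + math.sqrt(self.c * math.log(self.t) / n)
            for v, n in zip(self.values, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, i, reward):
        self.t += 1
        self.counts[i] += 1
        # Incremental mean update.
        self.values[i] += (reward - self.values[i]) / self.counts[i]


def toy_reward(epochs, rng):
    # Hypothetical stand-in for a learning signal: too few epochs underuse
    # the data, too many overfit it. These numbers are invented.
    mean = {1: 0.2, 2: 0.5, 3: 0.9, 5: 0.4}[epochs]
    noisy = mean + rng.uniform(-0.05, 0.05)
    return min(1.0, max(0.0, noisy))


def run_demo(rounds=2000, seed=0):
    rng = random.Random(seed)
    bandit = UCB1(arms=[1, 2, 3, 5])
    for _ in range(rounds):
        i = bandit.select()
        epochs = bandit.arms[i]
        # In an RL loop, this is where the agent would train for `epochs`
        # passes over the current rollout before scoring the choice.
        bandit.update(i, toy_reward(epochs, rng))
    return bandit


if __name__ == "__main__":
    b = run_demo()
    for arm, n, v in zip(b.arms, b.counts, b.values):
        print(f"epochs={arm}: pulls={n}, mean reward={v:.3f}")
```

In this toy setting the bandit concentrates its pulls on the arm with the highest mean reward while still occasionally sampling the others, which is the behaviour that lets data usage change across learning stages if the reward landscape shifts.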

Code Implementations: 1 repo