LG AI | May 22, 2022

Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation

Qingfeng Lan, Yangchen Pan, Jun Luo, A. Rupam Mahmood

arXiv:2205.10868v511.110 citations | h-index: 39 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses memory constraints for reinforcement learning on edge devices, but it is incremental as it builds on existing deep Q-network methods.

The paper tackles the problem of catastrophic forgetting and high memory usage in deep reinforcement learning by proposing memory-efficient algorithms that consolidate knowledge from target to current Q-networks, achieving comparable or better performance in feature-based and image-based tasks while reducing reliance on large replay buffers.

Artificial neural networks are promising for general function approximation but challenging to train on non-independent or non-identically distributed data due to catastrophic forgetting. The experience replay buffer, a standard component in deep reinforcement learning, is often used to reduce forgetting and improve sample efficiency by storing experiences in a large buffer and using them for training later. However, a large replay buffer results in a heavy memory burden, especially for onboard and edge devices with limited memory capacities. We propose memory-efficient reinforcement learning algorithms based on the deep Q-network algorithm to alleviate this problem. Our algorithms reduce forgetting and maintain high sample efficiency by consolidating knowledge from the target Q-network to the current Q-network. Compared to baseline methods, our algorithms achieve comparable or better performance in both feature-based and image-based tasks while easing the burden of large experience replay buffers.
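
The abstract does not spell out how knowledge is consolidated from the target Q-network to the current Q-network, so the following is only a minimal sketch of one plausible reading: a DQN-style TD loss plus a penalty that keeps the current network's Q-values close to the target network's on a set of probe states. The linear Q-function, the uniformly sampled probe states, the weight `lam`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def q_values(weights, states):
    """Linear Q-function (assumed for brevity): rows are states, columns are actions."""
    return states @ weights

def td_loss(weights, target_weights, batch, gamma=0.99):
    """Mean squared TD error against the frozen target network, as in DQN."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_values(weights, states)[np.arange(len(actions)), actions]
    next_max = q_values(target_weights, next_states).max(axis=1)
    targets = rewards + gamma * (1.0 - dones) * next_max
    return np.mean((q_sa - targets) ** 2)

def consolidation_loss(weights, target_weights, probe_states):
    """Hypothetical consolidation term: penalize drift from the target
    network's Q-values on probe states that need not come from a buffer."""
    diff = q_values(weights, probe_states) - q_values(target_weights, probe_states)
    return np.mean(diff ** 2)

def total_loss(weights, target_weights, batch, probe_states, lam=1.0, gamma=0.99):
    """TD loss on a small batch plus the weighted consolidation penalty."""
    return (td_loss(weights, target_weights, batch, gamma)
            + lam * consolidation_loss(weights, target_weights, probe_states))

rng = np.random.default_rng(0)
state_dim, n_actions = 4, 2

# Target network and a slightly perturbed current network.
target_weights = rng.normal(size=(state_dim, n_actions))
weights = target_weights + 0.1 * rng.normal(size=(state_dim, n_actions))

# A small batch of recent transitions stands in for a large replay buffer.
batch = (
    rng.normal(size=(8, state_dim)),       # states
    rng.integers(0, n_actions, size=8),    # actions
    rng.normal(size=8),                    # rewards
    rng.normal(size=(8, state_dim)),       # next states
    np.zeros(8),                           # done flags
)

# Probe states drawn uniformly from an assumed bounded state space.
probe_states = rng.uniform(-1.0, 1.0, size=(32, state_dim))

print("TD loss:", td_loss(weights, target_weights, batch))
print("consolidation loss:", consolidation_loss(weights, target_weights, probe_states))
print("total loss:", total_loss(weights, target_weights, batch, probe_states))
```

In this sketch the penalty is zero when the two networks agree and grows as the current network drifts, which is one way a method could limit forgetting while storing only a handful of recent transitions.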
