LGAI
May 22, 2022

Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation

arXiv:2205.10868v5
10 citations
h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses memory constraints for reinforcement learning on edge devices, but it is incremental as it builds on existing deep Q-network methods.

The paper tackles catastrophic forgetting and high memory usage in deep reinforcement learning. It proposes memory-efficient algorithms that consolidate knowledge from the target Q-network to the current Q-network, achieving comparable or better performance on feature-based and image-based tasks while reducing reliance on large replay buffers.

Artificial neural networks are promising for general function approximation but challenging to train on non-independent or non-identically distributed data due to catastrophic forgetting. The experience replay buffer, a standard component in deep reinforcement learning, is often used to reduce forgetting and improve sample efficiency by storing experiences in a large buffer and using them for training later. However, a large replay buffer results in a heavy memory burden, especially for onboard and edge devices with limited memory capacities. We propose memory-efficient reinforcement learning algorithms based on the deep Q-network algorithm to alleviate this problem. Our algorithms reduce forgetting and maintain high sample efficiency by consolidating knowledge from the target Q-network to the current Q-network. Compared to baseline methods, our algorithms achieve comparable or better performance in both feature-based and image-based tasks while easing the burden of large experience replay buffers.
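
The abstract does not spell out how knowledge is consolidated, so the following is only a minimal sketch of one plausible reading: the usual DQN temporal-difference loss plus a penalty that keeps the current Q-values close to the target network's Q-values on a set of probe states. The linear Q-function, the squared-difference form of the penalty, and the names `consolidation_loss`, `probe_states`, and `lam` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def q_values(weights, states):
    """Linear Q-function (an assumption): rows are states, columns are actions."""
    return states @ weights


def td_loss(weights, target_weights, batch, gamma=0.99):
    """Standard one-step DQN loss on a batch of transitions."""
    states, actions, rewards, next_states, dones = batch
    q_sa = q_values(weights, states)[np.arange(len(actions)), actions]
    next_max = q_values(target_weights, next_states).max(axis=1)
    targets = rewards + gamma * (1.0 - dones) * next_max
    return np.mean((q_sa - targets) ** 2)


def consolidation_loss(weights, target_weights, probe_states):
    """Assumed penalty: squared gap between current and target Q-values,
    summed over actions and averaged over the probe states."""
    gap = q_values(weights, probe_states) - q_values(target_weights, probe_states)
    return np.mean(np.sum(gap ** 2, axis=1))


def total_loss(weights, target_weights, batch, probe_states, lam=1.0):
    """TD loss plus the weighted consolidation penalty (lam is hypothetical)."""
    return td_loss(weights, target_weights, batch, gamma=0.99) + lam * consolidation_loss(
        weights, target_weights, probe_states
    )
```

Under this reading, the penalty needs only states to evaluate both networks on, not stored transitions, which is one way a method could lean less on a large replay buffer. How the probe states are chosen is left open here.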

Code Implementations: 1 repo