LG · AI · Jun 19, 2025

Energy-Based Transfer for Reinforcement Learning

arXiv:2506.16590v1 · h-index: 9
Originality: Incremental advance
AI Analysis

This paper addresses sample efficiency for reinforcement learning practitioners working in multi-task settings. The advance is incremental because it builds on existing transfer learning approaches.

The paper targets poor sample efficiency in multi-task and continual reinforcement learning. It proposes an energy-based transfer learning method in which a teacher policy issues guidance only in states that out-of-distribution detection places within its training distribution, and it reports improved sample efficiency and performance in both single-task and multi-task settings.

Reinforcement learning algorithms often suffer from poor sample efficiency, making them challenging to apply in multi-task or continual learning settings. Efficiency can be improved by transferring knowledge from a previously trained teacher policy to guide exploration in new but related tasks. However, if the new task sufficiently differs from the teacher's training task, the transferred guidance may be sub-optimal and bias exploration toward low-reward behaviors. We propose an energy-based transfer learning method that uses out-of-distribution detection to selectively issue guidance, enabling the teacher to intervene only in states within its training distribution. We theoretically show that energy scores reflect the teacher's state-visitation density and empirically demonstrate improved sample efficiency and performance across both single-task and multi-task settings.
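The gating idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: the energy is taken to be the common free-energy score computed from the teacher's action logits, lower energy is treated as "in distribution", and the names `energy_score`, `select_action`, `teacher_logits_fn`, `student_policy_fn`, and `threshold` are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple


def energy_score(logits: Sequence[float], temperature: float = 1.0) -> float:
    """Free-energy score E = -T * logsumexp(logits / T).

    Assumed convention: lower energy means the state looks more like
    the teacher's training distribution.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return -temperature * lse


def select_action(
    state,
    teacher_logits_fn: Callable[[object], Sequence[float]],
    student_policy_fn: Callable[[object], int],
    threshold: float,
    temperature: float = 1.0,
) -> Tuple[int, bool]:
    """Return (action, guided). The teacher intervenes only when the
    energy of the state is at or below the (hypothetical) threshold."""
    logits = teacher_logits_fn(state)
    if energy_score(logits, temperature) <= threshold:
        # State looks in-distribution: follow the teacher's greedy action.
        action = max(range(len(logits)), key=lambda i: logits[i])
        return action, True
    # State looks out-of-distribution: let the student explore on its own.
    return student_policy_fn(state), False


if __name__ == "__main__":
    # Toy teacher: confident logits in a "familiar" state, flat logits otherwise.
    teacher = lambda s: [10.0, 0.0] if s == "familiar" else [0.0, 0.0]
    student = lambda s: 1  # placeholder student policy
    print(select_action("familiar", teacher, student, threshold=-5.0))
    print(select_action("novel", teacher, student, threshold=-5.0))
```

In this toy setup the threshold is chosen by hand. In practice it would have to be calibrated, for example from energy scores observed on the teacher's own training states, and that calibration step is also an assumption rather than something the abstract specifies.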

