LG | Feb 29, 2024

A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations

arXiv:2402.18836v1 | 3 citations | h-index: 41 | CDC
Originality: Incremental advance
AI Analysis

This work addresses sample efficiency for reinforcement learning practitioners, presenting an incremental improvement by combining existing methods.

The paper tackles sample efficiency in deep reinforcement learning by incorporating expert observations that carry no explicit action information, and reports better performance than the compared benchmarks on continuous control tasks.

This paper investigates how to incorporate expert observations (without explicit information on expert actions) into a deep reinforcement learning setting to improve sample efficiency. First, we formulate an augmented policy loss combining a maximum entropy reinforcement learning objective with a behavioral cloning loss that leverages a forward dynamics model. Then, we propose an algorithm that automatically adjusts the weights of each component in the augmented loss function. Experiments on a variety of continuous control tasks demonstrate that the proposed algorithm outperforms various benchmarks by effectively utilizing available expert observations.
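The augmented loss described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy additive dynamics model, the linear toy policy, and the fixed weights `w_rl` and `w_bc` are all assumptions made for illustration. The abstract does not specify the automatic weight-adjustment rule, so the weights here are simply passed in by the caller. The idea shown is that, lacking expert actions, the behavioral cloning term compares the next state predicted by a forward dynamics model under the policy's action with the next state the expert actually reached.

```python
from typing import Callable, List

Vector = List[float]


def bc_loss_from_observations(
    policy_action: Callable[[Vector], Vector],
    dynamics: Callable[[Vector, Vector], Vector],
    state: Vector,
    expert_next_state: Vector,
) -> float:
    """Squared error between the next state predicted under the policy's
    action and the expert's observed next state (no expert action needed)."""
    action = policy_action(state)
    predicted_next = dynamics(state, action)
    return sum((p - e) ** 2 for p, e in zip(predicted_next, expert_next_state))


def max_entropy_policy_loss(log_prob: float, q_value: float, alpha: float) -> float:
    """Per-sample maximum entropy (SAC-style) policy loss:
    alpha * log pi(a|s) - Q(s, a)."""
    return alpha * log_prob - q_value


def augmented_policy_loss(rl_loss: float, bc_loss: float, w_rl: float, w_bc: float) -> float:
    """Weighted sum of the RL term and the observation-based cloning term.
    The weights are hypothetical fixed inputs, not the paper's adaptive rule."""
    return w_rl * rl_loss + w_bc * bc_loss


# Toy stand-ins (assumptions): additive dynamics s' = s + a and a linear policy.
def toy_dynamics(state: Vector, action: Vector) -> Vector:
    return [s + a for s, a in zip(state, action)]


def toy_policy(state: Vector) -> Vector:
    return [-0.5 * s for s in state]


state = [1.0, -2.0]
expert_next_state = [0.0, 0.0]  # where the expert ended up from `state`

bc = bc_loss_from_observations(toy_policy, toy_dynamics, state, expert_next_state)
rl = max_entropy_policy_loss(log_prob=-1.0, q_value=2.0, alpha=0.2)
total = augmented_policy_loss(rl, bc, w_rl=1.0, w_bc=0.5)
```

In a real training loop the policy and dynamics model would be neural networks trained by gradient descent, and the losses would be averaged over minibatches; the scalar version above only shows how the two terms are formed and combined.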

Code Implementations: 1 repo