LG · ML · Feb 8, 2024

Learning Uncertainty-Aware Temporally-Extended Actions

arXiv:2402.05439v1 · 3 citations · h-index: 4 · AAAI
Originality: Incremental advance
AI Analysis

The paper addresses a specific limitation of action repetition in reinforcement learning: repeating sub-optimal actions can undermine policy learning efficiency. The advance is incremental because it builds on existing action repetition techniques.

The paper tackles a known weakness of action repetition in reinforcement learning: performance can degrade when sub-optimal actions are repeated. It proposes the Uncertainty-aware Temporal Extension (UTE) algorithm, which uses ensemble methods to measure uncertainty, and reports that UTE outperforms existing action repetition algorithms in Gridworld and Atari 2600 environments.

In reinforcement learning, temporal abstraction in the action space, exemplified by action repetition, is a technique to facilitate policy learning through extended actions. However, a primary limitation in previous studies of action repetition is its potential to degrade performance, particularly when sub-optimal actions are repeated. This issue often negates the advantages of action repetition. To address this, we propose a novel algorithm named Uncertainty-aware Temporal Extension (UTE). UTE employs ensemble methods to accurately measure uncertainty during action extension. This feature allows policies to strategically choose between emphasizing exploration or adopting an uncertainty-averse approach, tailored to their specific needs. We demonstrate the effectiveness of UTE through experiments in Gridworld and Atari 2600 environments. Our findings show that UTE outperforms existing action repetition algorithms, effectively mitigating their inherent limitations and significantly enhancing policy learning efficiency.
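
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea it describes: using ensemble disagreement as an uncertainty signal when deciding how long to repeat an action. The function name `choose_extension_length`, the `uncertainty_weight` parameter, the mean-plus-weighted-standard-deviation score, and the toy numbers are all assumptions made for illustration, not the authors' method or code.

```python
from statistics import mean, pstdev


def choose_extension_length(ensemble_values, uncertainty_weight):
    """Pick a repetition length from ensemble value estimates (illustrative only).

    ensemble_values: dict mapping a candidate repetition length to a list of
        value estimates, one per ensemble member (hypothetical layout).
    uncertainty_weight: positive values favor uncertain lengths (exploration),
        negative values penalize them (uncertainty-averse), zero ignores
        uncertainty altogether.
    """
    def score(length):
        estimates = ensemble_values[length]
        # Disagreement between ensemble members (population standard
        # deviation) is used here as the uncertainty proxy.
        return mean(estimates) + uncertainty_weight * pstdev(estimates)

    # Iterating in sorted order means ties go to the shorter repetition length.
    return max(sorted(ensemble_values), key=score)


# Toy numbers invented for this example, not taken from the paper.
estimates = {
    1: [1.0, 1.1, 0.9, 1.0],  # short repeat: modest value, members agree
    4: [1.6, 0.2, 2.0, 0.4],  # long repeat: higher mean, members disagree
}

exploratory_choice = choose_extension_length(estimates, uncertainty_weight=1.0)
averse_choice = choose_extension_length(estimates, uncertainty_weight=-1.0)
```

With these toy numbers, a positive weight selects the longer, more uncertain repetition, while a negative weight falls back to the single-step action on which the ensemble members agree. That mirrors the choice between emphasizing exploration and being uncertainty-averse described above.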

Code Implementations: 1 repo