LG | ML | Apr 15, 2019

Disentangling Options with Hellinger Distance Regularizer

arXiv:1904.06887v1
Originality: Incremental advance
AI Analysis

This work addresses a specific issue in temporal abstraction for RL researchers, but it appears incremental as it builds directly on the option-critic architecture.

The paper tackles the problem of ensuring that learned options in reinforcement learning are mutually exclusive by proposing a Hellinger distance regularizer, and it compares this method with existing approaches using statistical indicators.

In reinforcement learning (RL), temporal abstraction remains an important unsolved problem. The options framework provided clues to temporal abstraction in RL, and the option-critic architecture elegantly solved the two problems of finding options and learning RL agents in an end-to-end manner. However, it is necessary to examine whether the options learned through this method play mutually exclusive roles. In this paper, we propose a Hellinger distance regularizer, a method for disentangling options. In addition, we examine several statistical indicators to compare the resulting options with those learned through the existing option-critic architecture.
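To make the idea of a Hellinger distance regularizer concrete, here is a minimal sketch in Python. The Hellinger distance between two discrete distributions p and q is the standard H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, which ranges from 0 (identical) to 1 (disjoint support). Everything else is an assumption made for illustration and is not taken from the paper: the function names, the choice to average over all option pairs at a single state, the coefficient `coeff`, and the way the term is subtracted from a base loss.

```python
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete distributions over the same actions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def mean_pairwise_hellinger(option_policies):
    """Average Hellinger distance over all pairs of option policies.

    `option_policies` is a list of action distributions, one per option,
    all evaluated at the same state (an assumption of this sketch).
    """
    pairs = list(combinations(option_policies, 2))
    if not pairs:
        return 0.0
    return sum(hellinger(p, q) for p, q in pairs) / len(pairs)


def regularized_loss(base_loss, option_policies, coeff=0.1):
    """Hypothetical combined objective: subtracting the distance term means
    that minimizing the loss pushes the option policies apart."""
    return base_loss - coeff * mean_pairwise_hellinger(option_policies)
```

For example, `hellinger([1.0, 0.0], [0.0, 1.0])` evaluates to 1.0 because the two policies never choose the same action, while two identical policies give 0.0. A larger `coeff` would weight separation between options more heavily relative to the base objective.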

