LG ML | Apr 15, 2019

Disentangling Options with Hellinger Distance Regularizer

arXiv:1904.06887v11.0

Originality: Incremental advance

AI Analysis

This work addresses a specific issue in temporal abstraction for RL researchers, but it appears incremental as it builds directly on the option-critic architecture.

The paper tackles the problem of ensuring that learned options in reinforcement learning are mutually exclusive by proposing a Hellinger distance regularizer, and it compares this method with existing approaches using statistical indicators.

In reinforcement learning (RL), temporal abstraction remains an important and unsolved problem. The options framework offered clues to temporal abstraction in RL, and the option-critic architecture elegantly solved the two problems of finding options and learning RL agents in an end-to-end manner. However, it is still necessary to examine whether the options learned this way play mutually exclusive roles. In this paper, we propose a Hellinger distance regularizer, a method for disentangling options. In addition, we examine various indicators from a statistical point of view to compare the resulting options with those learned through the existing option-critic architecture.
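For readers unfamiliar with the quantity named in the abstract, the sketch below computes the standard Hellinger distance between two discrete distributions, H(P, Q) = (1/sqrt(2)) * ||sqrt(P) - sqrt(Q)||_2, which ranges from 0 (identical) to 1 (disjoint support). The function names, and the choice to average the distance over all pairs of option policies, are assumptions made for illustration; the abstract does not say how the paper's regularizer is actually constructed.

```python
import math
from itertools import combinations


def hellinger(p, q):
    """Standard Hellinger distance between two discrete distributions.

    p and q are sequences of probabilities over the same set of actions.
    The result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


def mean_pairwise_hellinger(option_policies):
    """Average Hellinger distance over all pairs of option policies.

    Hypothetical helper: averaging over pairs is an assumption of this
    sketch, not a construction confirmed by the abstract.
    """
    pairs = list(combinations(option_policies, 2))
    if not pairs:
        raise ValueError("need at least two option policies")
    return sum(hellinger(p, q) for p, q in pairs) / len(pairs)


if __name__ == "__main__":
    # Action distributions of three hypothetical options in one state.
    policies = [[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]]
    print(mean_pairwise_hellinger(policies))
```

Under these assumptions, a larger average distance indicates that the options choose more distinct actions in the given state, so one plausible use is to reward that quantity during training.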
