AI LG ML

Feb 26, 2019

The Termination Critic

Anna Harutyunyan, Will Dabney, Diana Borsa, Nicolas Heess, Remi Munos, Doina Precup

arXiv:1902.09996v1

25.955 citations

Originality: Highly original

AI Analysis

This work addresses how reinforcement learning agents can discover behavioral abstractions (options) on their own, offering an information-theoretic view of when an option should terminate.

The paper tackles the problem of autonomously discovering behavioral abstractions (options) for reinforcement learning agents. It proposes an algorithm that trains the termination condition for the compressibility of the option's encoding rather than for a control objective, and reports that the resulting options are non-trivial and useful for learning and planning.

In this work, we consider the problem of autonomously discovering behavioral abstractions, or options, for reinforcement learning agents. We propose an algorithm that focuses on the termination condition, as opposed to -- as is common -- the policy. The termination condition is usually trained to optimize a control objective: an option ought to terminate if another has better value. We offer a different, information-theoretic perspective, and propose that terminations should focus instead on the compressibility of the option's encoding -- arguably a key reason for using abstractions. To achieve this algorithmically, we leverage the classical options framework, and learn the option transition model as a "critic" for the termination condition. Using this model, we derive gradients that optimize the desired criteria. We show that the resulting options are non-trivial, intuitively meaningful, and useful for learning and planning.
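To make the role of the option transition model more concrete, here is a minimal tabular sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: it assumes a small chain environment, a fixed option policy already folded into a state-to-state matrix `P_pi`, and it reads "compressibility" as low entropy of the distribution over states in which the option terminates. That reading, and the names `option_final_state_model` and `row_entropy`, are hypothetical choices made for this example.

```python
import numpy as np


def option_final_state_model(P_pi, beta):
    """Tabular option transition model (illustrative assumption).

    P_pi[s, x] : probability of moving from s to x under the option's policy.
    beta[x]    : probability that the option terminates on arriving in x.
    Returns F with F[s, f] = probability that an option started in s
    terminates in f. F solves F = stop + cont @ F.
    """
    n = P_pi.shape[0]
    cont = P_pi * (1.0 - beta)[None, :]  # arrive in x and keep going
    stop = P_pi * beta[None, :]          # arrive in x and terminate there
    return np.linalg.solve(np.eye(n) - cont, stop)


def row_entropy(F, eps=1e-12):
    """Entropy (in nats) of each start state's final-state distribution."""
    return -(F * np.log(F + eps)).sum(axis=1)


# A 5-state chain with a random-walk option policy (assumed toy setup).
n = 5
P_pi = np.zeros((n, n))
for s in range(n):
    P_pi[s, max(s - 1, 0)] += 0.5
    P_pi[s, min(s + 1, n - 1)] += 0.5

# Two hand-picked termination functions to compare.
beta_diffuse = np.full(n, 0.5)                             # may stop anywhere
beta_peaked = np.array([0.001, 0.001, 0.001, 0.001, 1.0])  # stops mostly in state 4

F_diffuse = option_final_state_model(P_pi, beta_diffuse)
F_peaked = option_final_state_model(P_pi, beta_peaked)

H_diffuse = row_entropy(F_diffuse)
H_peaked = row_entropy(F_peaked)
print("entropy per start state (diffuse):", np.round(H_diffuse, 3))
print("entropy per start state (peaked): ", np.round(H_peaked, 3))
```

In this toy setting the peaked termination function concentrates the option's endpoints on a single state, so its final-state distribution has lower entropy from every start state. A learned version would replace the closed-form solve with a model estimated from experience and adjust `beta` by gradient, which this sketch does not attempt.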
