LG ML Dec 5, 2019

Hindsight Credit Assignment

Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Azar, Bilal Piot, Nicolas Heess, Hado van Hasselt, Greg Wayne, Satinder Singh, Doina Precup, Remi Munos

arXiv:1912.02503v122.991 citations Has Code

Originality: Incremental advance

AI Analysis

This paper addresses credit assignment in reinforcement learning, but the advance appears incremental because it builds on existing value function concepts.

The paper tackles efficient credit assignment in reinforcement learning. It proposes assigning credit to past decisions according to how likely they were to have led to the observed outcome, using hindsight rather than foresight. Rewriting value functions through this lens yields a new family of algorithms, which the authors show address credit assignment challenges on a set of illustrative tasks.

We consider the problem of efficient credit assignment in reinforcement learning. In order to efficiently and meaningfully utilize new data, we propose to explicitly assign credit to past decisions based on the likelihood of them having led to the observed outcome. This approach uses new information in hindsight, rather than employing foresight. Somewhat surprisingly, we show that value functions can be rewritten through this lens, yielding a new family of algorithms. We study the properties of these algorithms, and empirically show that they successfully address important credit assignment challenges, through a set of illustrative tasks.
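To make the "hindsight rather than foresight" idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It is a one-step toy problem, with all names (`q_foresight`, `hindsight`, `q_hindsight`) and all numbers invented for illustration. It assumes a single state, a fixed policy `pi`, known outcome probabilities `P[a][y]`, and a reward `r[y]` that depends only on the outcome. Under those assumptions, Bayes' rule gives the probability that action `a` was taken given that outcome `y` was observed, and reweighting outcomes by that probability divided by `pi[a]` recovers the ordinary action value.

```python
def q_foresight(P, r):
    """Ordinary action values: expected reward of each action, looking forward."""
    return [sum(p * ri for p, ri in zip(row, r)) for row in P]


def hindsight(pi, P):
    """Return h[a][y] = Pr(action a | outcome y) and the outcome marginal under pi."""
    n_actions, n_outcomes = len(pi), len(P[0])
    # Probability of each outcome when actions are drawn from the policy.
    marginal = [sum(pi[a] * P[a][y] for a in range(n_actions))
                for y in range(n_outcomes)]
    # Bayes' rule: posterior over actions given the observed outcome.
    h = [[pi[a] * P[a][y] / marginal[y] for y in range(n_outcomes)]
         for a in range(n_actions)]
    return h, marginal


def q_hindsight(pi, P, r):
    """Action values from on-policy outcomes, reweighted by h(a | y) / pi(a)."""
    h, marginal = hindsight(pi, P)
    return [sum(marginal[y] * (h[a][y] / pi[a]) * r[y] for y in range(len(r)))
            for a in range(len(pi))]


# Hypothetical toy numbers: two actions, three outcomes.
pi = [0.7, 0.3]                 # policy over actions
P = [[0.9, 0.1, 0.0],           # outcome distribution for action 0
     [0.2, 0.3, 0.5]]           # outcome distribution for action 1
r = [0.0, 1.0, 4.0]             # reward attached to each outcome

forward = q_foresight(P, r)
backward = q_hindsight(pi, P, r)
```

In this toy setting the two lists agree up to floating-point error. The point of the hindsight form is that a single outcome sampled under the policy can update the value of every action, weighted by how likely each action was to have produced it.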
