AI · Sep 7, 2016

Unifying task specification in reinforcement learning

arXiv:1609.01995v4 · 96 citations

Originality: Incremental advance

AI Analysis

This work provides a foundational formalism for RL that simplifies algorithm development and theoretical analysis, though it is incremental in extending existing frameworks.

The paper tackles the lack of modularity in specifying reinforcement learning tasks by introducing the RL task formalism, which unifies task specifications through constructs like transition-based discounting and extends theoretical results such as approximation error bounds.

Reinforcement learning tasks are typically specified as Markov decision processes. This formalism has been highly successful, though specifications often couple the dynamics of the environment and the learning objective. This lack of modularity can complicate generalization of the task specification, as well as obfuscate connections between different task settings, such as episodic and continuing. In this work, we introduce the RL task formalism, which provides a unification through simple constructs, including a generalization to transition-based discounting. Through a series of examples, we demonstrate the generality and utility of this formalism. Finally, we extend standard learning constructs, including Bellman operators, and extend some seminal theoretical results, including approximation error bounds. Overall, we provide a well-understood and sound formalism on which to build theoretical results and simplify algorithm use and development.
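To make transition-based discounting concrete, here is a minimal sketch, not taken from the paper, of policy evaluation in which the discount depends on the transition (s, s') rather than being a single constant. It assumes a small tabular chain under a fixed policy; the names `evaluate_policy`, `P`, `R`, and `Gamma` are illustrative. Setting the discount to 0 on the transition that resets the chain encodes an episodic task inside a continuing process, which is one reading of how the formalism connects the two settings.

```python
import numpy as np

def evaluate_policy(P, R, Gamma):
    """Solve v(s) = sum_s' P[s, s'] * (R[s, s'] + Gamma[s, s'] * v(s')).

    P     : (n, n) state-transition matrix under a fixed policy (assumed)
    R     : (n, n) reward received on each transition s -> s'
    Gamma : (n, n) transition-based discount, entries in [0, 1]
    """
    n = P.shape[0]
    r_pi = (P * R).sum(axis=1)   # expected one-step reward from each state
    P_gamma = P * Gamma          # transition matrix weighted by discount
    # Fixed point of the Bellman operator: (I - P_gamma) v = r_pi
    return np.linalg.solve(np.eye(n) - P_gamma, r_pi)

# Hypothetical 3-state cycle 0 -> 1 -> 2 -> 0 with reward 1 per step.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
R = np.ones((3, 3))

# No discounting within an episode; the 2 -> 0 transition ends the episode,
# so its discount is 0 and no value flows back across the reset.
Gamma = np.ones((3, 3))
Gamma[2, 0] = 0.0

v = evaluate_policy(P, R, Gamma)
print(v)  # remaining steps in the episode from each state: [3. 2. 1.]
```

With a constant discount of 1 everywhere, the matrix `I - P_gamma` in this example would be singular; the zero discount on the reset transition is what makes the values finite.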
