LG · AI · ML · Jun 1, 2018

Equivalence Between Wasserstein and Value-Aware Loss for Model-based Reinforcement Learning

arXiv:1806.01265v2 · 9 citations
Originality: Incremental advance
AI Analysis

The paper provides a theoretical foundation for using the Wasserstein metric in model-based reinforcement learning, a result aimed at RL researchers and practitioners.

The paper addresses the challenge of learning effective generative models in approximate model-based reinforcement learning. It shows that minimizing the value-aware model learning (VAML) objective is equivalent to minimizing the Wasserstein metric.

Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is a simple task, learning a useful model in the approximate setting is challenging. In this context, an important question is which loss function to use for model learning, as varying the loss function can have a remarkable impact on the effectiveness of planning. Recently, Farahmand et al. (2017) proposed a value-aware model learning (VAML) objective that captures the structure of the value function during model learning. Using tools from Asadi et al. (2018), we show that minimizing the VAML objective is in fact equivalent to minimizing the Wasserstein metric. This equivalence improves our understanding of value-aware models, and also creates a theoretical foundation for applications of Wasserstein in model-based reinforcement learning.
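One way to see why such an equivalence is plausible is the Kantorovich-Rubinstein duality, which writes the Wasserstein-1 distance between two distributions as the largest gap in expected value over all 1-Lipschitz functions. The sketch below is an illustration, not the paper's construction. It assumes a one-dimensional state space with finite support, and it takes the value-function class in a VAML-style worst-case loss to be the 1-Lipschitz functions. All names (`wasserstein_1d`, `value_gap`, `worst_case_value`, `true_next`, `model_next`) are hypothetical and do not come from the paper.

```python
def wasserstein_1d(xs, p, q):
    """Wasserstein-1 distance between p and q on sorted support points xs.

    Uses the 1-D closed form: the integral of |CDF_p - CDF_q|.
    """
    total, cdf_gap = 0.0, 0.0
    for i in range(len(xs) - 1):
        cdf_gap += p[i] - q[i]                      # CDF_p(x_i) - CDF_q(x_i)
        total += abs(cdf_gap) * (xs[i + 1] - xs[i])
    return total


def value_gap(v, p, q):
    """|E_p[v] - E_q[v]|: the model error as seen through one value function v."""
    return abs(sum(vi * (pi - qi) for vi, pi, qi in zip(v, p, q)))


def worst_case_value(xs, p, q):
    """Build a 1-Lipschitz v that maximizes value_gap(v, p, q).

    On each interval the slope is +1 or -1, chosen opposite to the sign
    of the CDF difference, so every interval adds to the gap.
    """
    v, cdf_gap = [0.0], 0.0
    for i in range(len(xs) - 1):
        cdf_gap += p[i] - q[i]
        slope = -1.0 if cdf_gap > 0 else 1.0
        v.append(v[-1] + slope * (xs[i + 1] - xs[i]))
    return v


# Hypothetical example: true and learned next-state distributions
# over four states located at 0, 1, 2, 3.
states = [0.0, 1.0, 2.0, 3.0]
true_next = [0.5, 0.5, 0.0, 0.0]
model_next = [0.0, 0.0, 0.5, 0.5]

w1 = wasserstein_1d(states, true_next, model_next)
v_star = worst_case_value(states, true_next, model_next)
worst_gap = value_gap(v_star, true_next, model_next)

# Under the 1-Lipschitz assumption, the worst-case value-aware error
# coincides with the Wasserstein-1 distance.
assert abs(w1 - worst_gap) < 1e-12
```

Any other 1-Lipschitz function passed to `value_gap` yields a gap no larger than `w1`, so under these assumptions driving the worst-case value-aware error to zero and driving the Wasserstein distance to zero are the same objective.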

