LG, AI, ML · Oct 2, 2019

CWAE-IRL: Formulating a supervised approach to Inverse Reinforcement Learning problem

arXiv:1910.00584v1
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning reward functions without knowledge of the system dynamics, offering an efficient alternative for applications in robotics and AI. It appears incremental, however, because it builds on existing variational and auto-encoder methods.

The paper tackles the inverse reinforcement learning problem by proposing a variational inference approach that uses a conditional Wasserstein auto-encoder to infer reward functions from expert actions. It reports effective learning on the objectworld and pendulum benchmarks.

Inverse reinforcement learning (IRL) is used to infer the reward function from the actions of an expert running a Markov Decision Process (MDP). This research proposes a novel approach that uses variational inference to learn the reward function. With this technique, the intractable posterior distribution of the continuous latent variable (here, the reward function) is analytically approximated so that it stays as close as possible to the prior belief while reconstructing the future state conditioned on the current state and action. The reward function is derived using a well-known deep generative model, the Conditional Variational Auto-encoder (CVAE), trained with a Wasserstein loss function. The resulting method is called Conditional Wasserstein Auto-encoder-IRL (CWAE-IRL) and can be analyzed as a combination of backward and forward inference. It offers an efficient alternative to previous IRL approaches while requiring no knowledge of the agent's system dynamics. Experimental results on standard benchmarks such as objectworld and pendulum show that the proposed algorithm can effectively learn the latent reward function in complex, high-dimensional environments.
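To make the shape of such an objective concrete, here is a minimal NumPy sketch of a conditional Wasserstein auto-encoder loss. It is an illustration under stated assumptions, not the paper's implementation: the linear encoder and decoder, the scalar latent reward, the standard normal prior, the RBF-kernel MMD penalty (one common way to train Wasserstein auto-encoders), and all names (`rbf_mmd2`, `cwae_loss`, `enc_w`, `dec_w`, `lam`) are hypothetical choices made for this example. Mapping the encoder to "backward" and the decoder to "forward" inference is likewise one reading of the abstract.

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    """Biased (V-statistic) estimate of squared MMD with an RBF kernel."""
    def kernel(a, b):
        sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def cwae_loss(s, a, s_next, enc_w, dec_w, lam=10.0, rng=None):
    """Reconstruction error plus a penalty pulling the latent toward the prior."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Assumed encoder: (s, a, s') -> latent reward r (linear, for illustration only).
    r = np.concatenate([s, a, s_next], axis=1) @ enc_w
    # Assumed decoder: (s, a, r) -> predicted next state.
    s_pred = np.concatenate([s, a, r], axis=1) @ dec_w
    recon = np.mean(np.sum((s_pred - s_next) ** 2, axis=1))
    # Assumed prior belief over the reward: standard normal.
    prior = rng.standard_normal(r.shape)
    penalty = rbf_mmd2(r, prior)
    return recon + lam * penalty, recon, penalty

# Toy transition batch with random data (not from any benchmark).
rng = np.random.default_rng(42)
n, ds, da = 64, 3, 1
s = rng.standard_normal((n, ds))
a = rng.standard_normal((n, da))
s_next = rng.standard_normal((n, ds))
enc_w = 0.1 * rng.standard_normal((2 * ds + da, 1))
dec_w = 0.1 * rng.standard_normal((ds + da + 1, ds))
total, recon, penalty = cwae_loss(s, a, s_next, enc_w, dec_w, rng=rng)
print(total, recon, penalty)
```

In a real training loop the encoder and decoder would be neural networks and the loss would be minimized by gradient descent. The sketch only evaluates the objective once, to show how the reconstruction term and the prior-matching term combine.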

