LG | ML | Apr 29, 2023

A Coupled Flow Approach to Imitation Learning

arXiv:2305.00303v1 | 16 citations | h-index: 70
Originality: Highly original
AI Analysis

This work addresses a bottleneck in reinforcement and imitation learning: the state distribution induced by a policy is rarely modeled explicitly. It offers a normalizing flow-based approach to distribution matching.

The paper tackles the challenge of explicitly modeling state distributions in imitation learning by introducing a normalizing flow-based method, achieving state-of-the-art performance on benchmark tasks with a single expert trajectory.

In reinforcement learning and imitation learning, an object of central importance is the state distribution induced by the policy. It plays a crucial role in the policy gradient theorem, and references to it, along with the related state-action distribution, can be found all across the literature. Despite its importance, the state distribution is mostly discussed indirectly and theoretically, rather than being modeled explicitly. The reason is an absence of appropriate density estimation tools. In this work, we investigate applications of a normalizing flow-based model for the aforementioned distributions. In particular, we use a pair of flows coupled through the optimality point of the Donsker-Varadhan representation of the Kullback-Leibler (KL) divergence for distribution-matching-based imitation learning. Our algorithm, Coupled Flow Imitation Learning (CFIL), achieves state-of-the-art performance on benchmark tasks with a single expert trajectory and extends naturally to a variety of other settings, including the subsampled and state-only regimes.
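
The "optimality point" mentioned in the abstract can be illustrated with a small numerical check. The Donsker-Varadhan representation states that KL(p || q) is the supremum over functions f of E_p[f] - log E_q[exp(f)], and the supremum is attained when f equals the log-density ratio log p - log q (up to an additive constant). The sketch below is not the CFIL implementation: it assumes two small discrete distributions in place of learned flows, and the names `kl_divergence`, `dv_bound`, `f_star`, and `f_other` are hypothetical, chosen only for illustration.

```python
import numpy as np


def kl_divergence(p, q):
    """Exact KL(p || q) for discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))


def dv_bound(f, p, q):
    """Donsker-Varadhan lower bound: E_p[f] - log E_q[exp(f)]."""
    return float(np.sum(p * f) - np.log(np.sum(q * np.exp(f))))


# Toy stand-ins for the two densities (assumption: in the paper's setting
# these would be densities modeled by normalizing flows).
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(5))
q = rng.dirichlet(np.ones(5))

# Optimality point: the log-density ratio log p - log q.
f_star = np.log(p) - np.log(q)
# Any other critic gives a bound that is no larger than the KL divergence.
f_other = rng.normal(size=5)

kl = kl_divergence(p, q)
bound_at_optimum = dv_bound(f_star, p, q)
bound_elsewhere = dv_bound(f_other, p, q)

print(f"KL(p || q)            = {kl:.6f}")
print(f"DV bound at log p/q   = {bound_at_optimum:.6f}")
print(f"DV bound at random f  = {bound_elsewhere:.6f}")
```

At `f_star` the second term vanishes, because E_q[p/q] = 1, so the bound equals the KL divergence exactly. This is why a critic parameterized as the difference of two log-densities is a natural fit: it has the same form as the optimal f.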

Code implementations: 2 repos