LG · SY · Jul 29, 2023

Initial State Interventions for Deconfounded Imitation Learning

arXiv:2307.15980v3 · 3 citations · h-index: 30
Originality: Incremental advance
AI Analysis

This work addresses causal confusion in imitation learning, offering a method to improve policy performance without additional expert input. It is an incremental advance because it builds on existing disentanglement approaches.

The paper tackles causal confusion in imitation learning by proposing a masking algorithm that uses initial state interventions to mask observed confounders, without requiring expert queries or causal graphs. It applies the algorithm to behavior cloning for the CartPole and Reacher control systems, and proves that, under certain assumptions, the algorithm is conservative: it does not mask observations that causally influence the expert.

Imitation learning suffers from causal confusion. This phenomenon occurs when learned policies attend to features that do not causally influence the expert actions but are instead spuriously correlated. Causally confused agents produce low open-loop supervised loss but poor closed-loop performance upon deployment. We consider the problem of masking observed confounders in a disentangled representation of the observation space. Our novel masking algorithm leverages the usual ability to intervene in the initial system state, avoiding any requirement involving expert querying, expert reward functions, or causal graph specification. Under certain assumptions, we theoretically prove that this algorithm is conservative in the sense that it does not incorrectly mask observations that causally influence the expert; furthermore, intervening on the initial state serves to strictly reduce excess conservatism. The masking algorithm is applied to behavior cloning for two illustrative control systems: CartPole and Reacher.
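
To make the idea of masking a confounder concrete, here is a minimal toy sketch in Python. It is not the paper's masking algorithm: the mask is supplied by hand rather than discovered through initial state interventions, and the data, the linear policy, and every name (`make_demos`, `fit_masked_policy`, `solve`) are assumptions made purely for illustration. The expert action depends on a state feature `s`, while a second feature `c` is a near copy of the action itself and so is spuriously predictive. Least-squares behavior cloning on both features leans almost entirely on `c`; masking `c` forces the policy onto the causal feature.

```python
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_masked_policy(features, actions, mask):
    """Least-squares behavior cloning using only features where mask[j] == 1.

    Returns a full-length weight vector with zeros at masked positions.
    """
    kept = [j for j, m in enumerate(mask) if m]
    # Normal equations (X^T X) w = X^T a restricted to the kept columns.
    A = [[sum(f[i] * f[j] for f in features) for j in kept] for i in kept]
    b = [sum(f[i] * a for f, a in zip(features, actions)) for i in kept]
    weights = [0.0] * len(mask)
    for j, w in zip(kept, solve(A, b)):
        weights[j] = w
    return weights


def make_demos(n, seed=0):
    """Toy demonstrations (hypothetical): action = 2*s + noise; c is a noisy copy of the action."""
    rng = random.Random(seed)
    features, actions = [], []
    for _ in range(n):
        s = rng.uniform(-1.0, 1.0)           # causal state feature
        a = 2.0 * s + rng.gauss(0.0, 0.5)    # noisy expert action
        c = a + rng.gauss(0.0, 0.05)         # spuriously correlated feature
        features.append([s, c])
        actions.append(a)
    return features, actions


features, actions = make_demos(500)
w_unmasked = fit_masked_policy(features, actions, mask=[1, 1])
w_masked = fit_masked_policy(features, actions, mask=[1, 0])
print("unmasked weights [s, c]:", w_unmasked)
print("masked weights   [s, c]:", w_masked)
```

In this toy setting the unmasked fit achieves lower training loss because `c` nearly reveals the action, which mirrors the low open-loop loss described above. If `c` is unavailable or uninformative at deployment, only the masked policy still tracks the expert's dependence on `s`.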

