LG | AI | Jun 11, 2025

Self-Predictive Representations for Combinatorial Generalization in Behavioral Cloning

arXiv:2506.10137v2 | 7 citations | h-index: 12
Originality: Incremental advance
AI Analysis

This paper addresses the poor zero-shot generalization of behavioral cloning to novel state-goal pairs, which is an incremental improvement for reinforcement learning applications.

The paper tackles the problem of combinatorial generalization in goal-conditioned behavior cloning by proposing a representation learning objective that encourages temporal consistency, achieving competitive performance on challenging tasks.

While goal-conditioned behavior cloning (GCBC) methods can perform well on in-distribution training tasks, they do not necessarily generalize zero-shot to tasks that require conditioning on novel state-goal pairs, i.e. combinatorial generalization. In part, this limitation can be attributed to a lack of temporal consistency in the state representation learned by BC; if temporally correlated states are properly encoded to similar latent representations, then the out-of-distribution gap for novel state-goal pairs would be reduced. We formalize this notion by demonstrating how encouraging long-range temporal consistency via successor representations (SR) can facilitate generalization. We then propose a simple yet effective representation learning objective, $\text{BYOL-}\gamma$ for GCBC, which theoretically approximates the successor representation in the finite MDP case through self-predictive representations, and achieves competitive empirical performance across a suite of challenging tasks requiring combinatorial generalization.
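To make the successor representation mentioned in the abstract concrete, the sketch below computes it in closed form for a small finite Markov chain, using the standard identity $M = \sum_{t \ge 0} \gamma^t P^t = (I - \gamma P)^{-1}$. It is only an illustration of the SR itself, not the paper's $\text{BYOL-}\gamma$ objective. The six-state random-walk chain, the function names, and the choice $\gamma = 0.9$ are assumptions made for this example.

```python
import numpy as np


def successor_representation(P, gamma):
    """Closed-form SR of a finite Markov chain: M = (I - gamma * P)^{-1}."""
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)


def chain_transition_matrix(n):
    """Hypothetical example chain: unbiased random walk on n states.

    At either end, the step that would leave the chain keeps the state in place.
    """
    P = np.zeros((n, n))
    for s in range(n):
        left, right = max(s - 1, 0), min(s + 1, n - 1)
        P[s, left] += 0.5
        P[s, right] += 0.5
    return P


P = chain_transition_matrix(6)
M = successor_representation(P, gamma=0.9)

# Treat row s of M as the representation of state s and compare distances.
d_near = np.linalg.norm(M[0] - M[1])  # neighbouring states
d_far = np.linalg.norm(M[0] - M[5])   # opposite ends of the chain

print("distance between neighbouring states:", d_near)
print("distance between distant states:", d_far)
```

In this toy chain, states that are close in time receive closer SR rows than states at opposite ends, which is the kind of temporal consistency the abstract argues a learned representation should have.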

