RO · Apr 24

Policy Contrastive Decoding for Robotic Foundation Models

arXiv:2505.13255 · 98.06 citations · h-index: 48 · Has Code
Predicted impact: top 3% in RO · last 90 days · Originality: Incremental advance
AI Analysis

For robotic foundation model users, this work addresses the critical problem of spurious correlations limiting generalization, offering a practical plug-in solution.

The paper identifies that robotic foundation models learn spurious correlations from pre-training trajectories, which harms generalization. It proposes Policy Contrastive Decoding (PCD), a training-free method that improves robot policies by contrasting the action probabilities obtained from original and object-masked visual inputs. The largest reported gain is a 108% improvement for $π_0$ in the real-world environment.

Robotic foundation models, or generalist robot policies, hold immense potential to enable flexible, general-purpose, and dexterous robotic systems. Despite these advances, our experiments reveal that existing robot policies are prone to learning spurious correlations from pre-training trajectories, which harms their ability to generalize beyond the training data. To tackle this, we propose a novel Policy Contrastive Decoding (PCD) approach, which redirects the robot policy's focus toward object-relevant visual cues by contrasting action probability distributions derived from original and object-masked visual inputs. As a training-free method, PCD can be used as a plugin to improve different types of robot policies without fine-tuning or access to model weights. We conduct extensive experiments on three open-source robot policies: the autoregressive policy OpenVLA and the diffusion-based policies Octo and $π_0$. The results in both simulation and real-world environments demonstrate PCD's flexibility and effectiveness; for example, PCD improves the state-of-the-art policy $π_0$ by 8.9% in the simulation environment and by 108% in the real-world environment. Code and demos are publicly available at: https://koorye.github.io/PCD.
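To make the contrasting step concrete, here is a minimal sketch for a policy with a discrete action vocabulary. The abstract does not give the exact formula, so the combination rule used here, `(1 + alpha) * log p_original - alpha * log p_masked`, is an assumption borrowed from common contrastive-decoding formulations and may differ from the paper's. The function name, the `alpha` parameter, and the toy logits are all hypothetical. The diffusion-based case (Octo, $π_0$) is not covered.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def contrastive_action_probs(logits_original, logits_masked, alpha=1.0):
    """Contrast action distributions from original and object-masked inputs.

    ASSUMPTION: the contrast rule below is a generic contrastive-decoding
    form, not necessarily the rule used by PCD. `alpha` (hypothetical)
    controls how strongly the object-masked distribution is subtracted;
    alpha = 0 recovers the original policy distribution.
    """
    if len(logits_original) != len(logits_masked):
        raise ValueError("both inputs must score the same action set")
    log_p_orig = log_softmax(logits_original)
    log_p_mask = log_softmax(logits_masked)
    # Actions that stay likely even when the object is masked are penalized,
    # since their probability cannot be explained by the object itself.
    contrasted = [
        (1.0 + alpha) * lo - alpha * lm
        for lo, lm in zip(log_p_orig, log_p_mask)
    ]
    # Renormalize the contrasted scores into a probability distribution.
    return [math.exp(x) for x in log_softmax(contrasted)]


if __name__ == "__main__":
    # Toy logits (made up): action 0 is favored with or without the object,
    # while action 1 is only favored when the object is visible.
    original = [2.0, 1.5, 0.1]
    masked = [2.0, 0.2, 0.1]
    probs = contrastive_action_probs(original, masked, alpha=1.0)
    print([round(p, 3) for p in probs])
```

With these toy numbers the most likely action shifts from index 0 to index 1, the action whose score actually depends on the object being visible. Because the sketch only needs the two output distributions, it is consistent with the abstract's claim that the method requires neither fine-tuning nor access to model weights.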

Code Implementations · 1 repo
