LG | AI | RO | Apr 3, 2023

Chain-of-Thought Predictive Control

arXiv:2304.00776v2 | 24 citations | h-index: 33
Originality: Incremental advance
AI Analysis

This paper addresses generalizable policy learning from imperfect demonstrations for robotics and AI control systems. It represents an incremental advance in hierarchical imitation learning.

The paper tackles policy learning from sub-optimal demonstrations for complex low-level control tasks such as object manipulation. It proposes a hierarchical imitation learning method that discovers subskill decompositions of the demonstrations and uses a Transformer-based design to predict a chain-of-thought as guidance. The authors report that the method consistently outperforms strong baselines on challenging manipulation tasks.

We study generalizable policy learning from demonstrations for complex low-level control (e.g., contact-rich object manipulations). We propose a novel hierarchical imitation learning method that utilizes sub-optimal demos. First, we propose an observation space-agnostic approach that efficiently discovers the multi-step subskill decomposition of the demos in an unsupervised manner. It groups temporally close and functionally similar actions into subskill-level demo segments; the observations at the segment boundaries then constitute a chain of planning steps for the task, which we refer to as the chain-of-thought (CoT). Next, we propose a Transformer-based design that effectively learns to predict the CoT as the subskill-level guidance. We couple action and subskill predictions via learnable prompt tokens and a hybrid masking strategy, which enable dynamically updated guidance at test time and improve feature representation of the trajectory for generalizable policy learning. Our method, Chain-of-Thought Predictive Control (CoTPC), consistently surpasses existing strong baselines on challenging manipulation tasks with sub-optimal demos.
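
To make the subskill-discovery idea concrete, here is a minimal sketch of one way to group temporally adjacent, similar actions into segments and read off the boundary observations as a chain-of-thought. It is not the paper's algorithm: the greedy cosine-similarity rule, the `sim_threshold` parameter, and the function names `segment_actions` and `chain_of_thought` are all assumptions made for illustration. Because it looks only at actions, it does not depend on the observation space.

```python
import numpy as np


def segment_actions(actions, sim_threshold=0.5):
    """Greedily split an action sequence into contiguous segments.

    Illustrative assumption, not the paper's method: a new segment starts
    whenever the cosine similarity between the current action and the
    running mean action of the open segment falls below `sim_threshold`.
    Returns a list of (start, end) index pairs with `end` inclusive.
    """
    actions = np.asarray(actions, dtype=float)
    segments = []
    start = 0
    mean = actions[0].copy()
    for t in range(1, len(actions)):
        a = actions[t]
        denom = np.linalg.norm(a) * np.linalg.norm(mean)
        # Treat zero-norm vectors as similar so they never force a split.
        sim = float(a @ mean / denom) if denom > 0 else 1.0
        if sim < sim_threshold:
            # Close the open segment and start a new one at step t.
            segments.append((start, t - 1))
            start = t
            mean = a.copy()
        else:
            # Update the running mean of the open segment.
            n = t - start
            mean = (mean * n + a) / (n + 1)
    segments.append((start, len(actions) - 1))
    return segments


def chain_of_thought(observations, segments):
    """Collect the observation at the last step of each segment."""
    return [observations[end] for _, end in segments]


# Toy 2-D action sequence: move right, then up, then down.
toy_actions = [[1, 0], [1, 0.1], [0.9, 0], [0, 1], [0.1, 1], [0, -1]]
toy_observations = list(range(len(toy_actions)))  # stand-in observations
toy_segments = segment_actions(toy_actions)
toy_cot = chain_of_thought(toy_observations, toy_segments)
```

On this toy input the sketch yields three segments, and the observations at their final steps form the chain. A learned policy would then be trained to predict such boundary observations as subskill-level guidance alongside the low-level actions.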

Code Implementations: 1 repo