Predicate Hierarchies Improve Few-Shot State Classification
This work addresses the challenge of adapting robotic planning and manipulation to new environments with limited data. It represents an incremental improvement, achieved by integrating hierarchical structure into state classification.
The paper tackles state classification of objects and relations in robotics, a problem that suffers from combinatorial complexity and poor generalization to novel environments when only a few examples are available. It proposes PHIER, a method that uses predicate hierarchies to achieve significant performance gains in few-shot and out-of-distribution scenarios, outperforming existing methods in simulated and real-world tasks.
State classification of objects and their relations is core to many long-horizon tasks, particularly in robot planning and manipulation. However, the combinatorial explosion of possible object-predicate combinations, coupled with the need to adapt to novel real-world environments, means that state classification models must generalize to novel queries from few examples. To this end, we propose PHIER, which leverages predicate hierarchies to generalize effectively in few-shot scenarios. PHIER uses an object-centric scene encoder, self-supervised losses that infer semantic relations between predicates, and a hyperbolic distance metric that captures hierarchical structure; it learns a structured latent space of image-predicate pairs that guides reasoning over state classification queries. We evaluate PHIER in the CALVIN and BEHAVIOR robotic environments and show that PHIER significantly outperforms existing methods in few-shot, out-of-distribution state classification, and demonstrates strong zero- and few-shot generalization from simulated to real-world tasks. Our results demonstrate that leveraging predicate hierarchies improves performance on state classification tasks with limited data.
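To make the phrase "hyperbolic distance metric" concrete, the following is a minimal sketch of one common choice, the distance in the Poincaré ball model. The abstract does not say which hyperbolic model PHIER uses, so the model, the function name `poincare_distance`, and the `eps` parameter are all assumptions made for illustration, not the paper's implementation.

```python
import math


def poincare_distance(u, v, eps=1e-9):
    """Distance between two points inside the unit Poincare ball.

    Assumption: the Poincare ball model is used here only as an example of a
    hyperbolic metric; the source does not specify PHIER's exact formulation.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")

    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Clamp the denominator so points very close to the boundary stay finite.
    denom = max((1.0 - sq_norm_u) * (1.0 - sq_norm_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)


# The same Euclidean gap (0.1) costs more hyperbolic distance near the boundary.
near_origin = poincare_distance([0.0, 0.0], [0.1, 0.0])
near_boundary = poincare_distance([0.8, 0.0], [0.9, 0.0])
print(near_origin < near_boundary)
```

Distances in this model grow rapidly toward the boundary of the ball, which is why such metrics are often used for tree-like data: general concepts can sit near the origin and more specific ones near the boundary. Whether PHIER arranges its predicates this way is not stated in the abstract.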