RO · AI · Apr 23, 2025

MOSAIC: A Skill-Centric Algorithmic Framework for Long-Horizon Manipulation Planning

arXiv:2504.16738v2 · 5 citations · h-index: 7
Originality Highly original
AI Analysis

The paper addresses the challenge of enabling general-purpose robots to handle novel tasks through flexible skill composition, proposing a novel method for a known bottleneck in robotics.

The paper tackles the problem of planning long-horizon manipulation motions by introducing MOSAIC, a skill-centric algorithmic framework that efficiently discovers physically-grounded solutions using generators and connectors, demonstrating efficacy in complex tasks in simulation and real-world settings.

Planning long-horizon manipulation motions using a set of predefined skills is a central challenge in robotics; solving it efficiently could enable general-purpose robots to tackle novel tasks by flexibly composing generic skills. Solutions to this problem lie in an infinitely vast space of parameterized skill sequences, a space where common incremental methods struggle to find sequences that have non-obvious intermediate steps. Some approaches reason over lower-dimensional, symbolic spaces, which are more tractable to explore but may be brittle and are laborious to construct. In this work, we introduce MOSAIC, a skill-centric, multi-directional planning approach that targets these challenges by reasoning about which skills to employ and where they are most likely to succeed, by utilizing physics simulation to estimate skill execution outcomes. Specifically, MOSAIC employs two complementary skill families: Generators, which identify "islands of competence" where skills are demonstrably effective, and Connectors, which link these skill-trajectories by solving boundary value problems. By focusing planning efforts on regions of high competence, MOSAIC efficiently discovers physically-grounded solutions. We demonstrate its efficacy on complex long-horizon problems in both simulation and the real world, using a diverse set of skills including generative diffusion models, motion planning algorithms, and manipulation-specific models. Visit skill-mosaic.github.io for demonstrations and examples.
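The generator/connector split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 2D point states, the hard-coded segments standing in for simulated skill rollouts, the straight-line "boundary value problem" with a reach limit, and all names (`Segment`, `generate_islands`, `connect`, `plan`, `max_reach`) are assumptions made for illustration. It also uses a plain shortest-path search over islands rather than the multi-directional strategy the paper describes.

```python
import heapq
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Segment:
    """A short trajectory that a skill is assumed to execute reliably."""
    skill: str
    start: Point
    end: Point


def generate_islands() -> List[Segment]:
    """Toy Generator: hand-picked segments standing in for simulated rollouts."""
    return [
        Segment("push", (1.0, 0.0), (2.0, 0.0)),
        Segment("pick_place", (3.0, 0.0), (3.0, 2.0)),
        Segment("push", (0.0, 5.0), (1.0, 5.0)),  # an island too far away to link
    ]


def connect(a: Point, b: Point, max_reach: float = 1.5) -> Optional[float]:
    """Toy Connector: a straight-line link from a to b.

    Returns the link cost, or None when b is beyond the assumed reach limit.
    """
    d = math.dist(a, b)
    return d if d <= max_reach else None


def plan(start: Point, goal: Point, islands: List[Segment]) -> Optional[List[str]]:
    """Dijkstra over islands; an edge exists wherever the connector succeeds."""
    # Start and goal are modeled as zero-length segments so every node
    # has both an entry point and an exit point.
    nodes = [Segment("start", start, start), *islands, Segment("goal", goal, goal)]
    goal_idx = len(nodes) - 1
    best = {0: 0.0}
    parent = {}
    queue = [(0.0, 0)]
    while queue:
        cost, u = heapq.heappop(queue)
        if u == goal_idx:
            break
        if cost > best.get(u, math.inf):
            continue  # stale queue entry
        for v, seg in enumerate(nodes):
            if v == u:
                continue
            link = connect(nodes[u].end, seg.start)
            if link is None:
                continue
            new_cost = cost + link + math.dist(seg.start, seg.end)
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                parent[v] = u
                heapq.heappush(queue, (new_cost, v))
    if goal_idx not in best:
        return None
    path = [goal_idx]
    while path[-1] != 0:
        path.append(parent[path[-1]])
    return [nodes[i].skill for i in reversed(path)]


if __name__ == "__main__":
    print(plan((0.0, 0.0), (3.0, 3.0), generate_islands()))
```

In this toy setting the start cannot reach the goal directly, so the plan has to pass through the two nearby islands, while the distant island is never linked. A real system would replace the hard-coded segments with simulated skill executions and the straight-line link with an actual boundary value solver.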

