RO · May 8

BrickCraft: Visuomotor Skill Composition with Situated Manual Guidance for Long-Horizon Interlocking Brick Assembly

arXiv:2605.07605 · 75.6
Predicted impact: top 20% in RO · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the challenge of long-horizon, generalizable robotic assembly for interlocking bricks, a domain-specific problem with potential applications in construction and manufacturing.

BrickCraft introduces a compositional framework for autonomous robotic assembly of interlocking bricks, enabling long-horizon task execution by decomposing assembly into reusable primitive skills with situated manual guidance. The system achieves strong compositional generalization to unseen structures from limited demonstrations.

Autonomous robotic assembly of interlocking bricks demands seamless integration of long-horizon task reasoning, spatial grounding, and fine-grained manipulation. This paper presents BrickCraft, a compositional framework designed for long-horizon and generalizable interlocking brick assembly. BrickCraft models the assembly process using a relative formulation, where each step is anchored to a reference brick within the partial structure, thereby decomposing complex tasks into a finite set of reusable primitive skills. BrickCraft bridges the gap between high-level assembly plans and physical execution through situated manuals, which provide explicit spatial guidance for learned visuomotor skills by projecting the assembly intent onto real-time robot observations. Finally, BrickCraft employs a compositional execution pipeline that chains these spatially grounded skills to accomplish long-horizon assembly tasks. Extensive experimental validations demonstrate that BrickCraft acquires proficient assembly skills from a limited set of demonstrations and exhibits strong compositional generalization to unseen structures. The project website is available at https://intelligent-control-lab.github.io/BrickCraft.
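The relative formulation described above can be illustrated with a small sketch. This is not the authors' implementation: the integer grid coordinates, the nearest-brick rule for choosing a reference, and the names `RelativeStep`, `to_relative_steps`, and `skill_vocabulary` are all assumptions made for illustration. The sketch only shows why anchoring each step to a brick already in the partial structure can reduce many absolute placements to a small, reusable set of relative offsets.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# Hypothetical convention: (x, y, z) position of a brick on an integer stud grid.
Pos = Tuple[int, int, int]


@dataclass(frozen=True)
class RelativeStep:
    reference: Optional[int]  # index of the anchor brick; None for the first brick
    offset: Pos               # placement expressed relative to the anchor


def to_relative_steps(placements: List[Pos]) -> List[RelativeStep]:
    """Rewrite absolute placements as steps anchored to an earlier brick.

    Assumption: the reference is the nearest already-placed brick by
    Manhattan distance, with ties going to the earliest brick.
    """
    steps: List[RelativeStep] = []
    for i, pos in enumerate(placements):
        if i == 0:
            steps.append(RelativeStep(None, (0, 0, 0)))
            continue
        ref = min(
            range(i),
            key=lambda j: sum(abs(a - b) for a, b in zip(pos, placements[j])),
        )
        anchor = placements[ref]
        offset = (pos[0] - anchor[0], pos[1] - anchor[1], pos[2] - anchor[2])
        steps.append(RelativeStep(ref, offset))
    return steps


def skill_vocabulary(steps: List[RelativeStep]) -> Set[Pos]:
    """Distinct relative offsets; each one stands in for a reusable primitive."""
    return {s.offset for s in steps if s.reference is not None}


# Two staircases at different absolute positions share one relative primitive.
staircase_a = [(0, 0, 0), (1, 0, 1), (2, 0, 2)]
staircase_b = [(5, 5, 0), (6, 5, 1), (7, 5, 2)]
assert skill_vocabulary(to_relative_steps(staircase_a)) == {(1, 0, 1)}
assert skill_vocabulary(to_relative_steps(staircase_b)) == {(1, 0, 1)}
```

In this toy setting, a structure placed anywhere in the workspace maps to the same offsets, which is the intuition behind composing a finite set of skills into unseen structures. How the paper selects reference bricks and defines its primitive skills may differ.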

