RO · LG · Mar 15, 2022

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

arXiv:2203.13733v2 · 36 citations · h-index: 41
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of open-ended training for embodied intelligent agents in robotics, though its contribution is incremental: it applies existing methods to a new domain.

The paper tackles the problem of training autonomous robots to assemble multi-part physical structures, using a physics-based environment with magnet blocks. It finds that large-scale reinforcement learning combined with graph-based policies enables agents to generalize zero-shot to unseen blueprints and to operate in a reset-free setting without having been trained to do so.

Assembly of multi-part physical structures is both a valuable end product for autonomous robotics, as well as a valuable diagnostic task for open-ended training of embodied intelligent agents. We introduce a naturalistic physics-based environment with a set of connectable magnet blocks inspired by children's toy kits. The objective is to assemble blocks into a succession of target blueprints. Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter. Furthermore, assembly stresses agents' multi-step planning, physical reasoning, and bimanual coordination. We find that the combination of large-scale reinforcement learning and graph-based policies -- surprisingly without any additional complexity -- is an effective recipe for training agents that not only generalize to complex unseen blueprints in a zero-shot manner, but even operate in a reset-free setting without being trained to do so. Through extensive experiments, we highlight the importance of large-scale training, structured representations, contributions of multi-task vs. single-task learning, as well as the effects of curriculums, and discuss qualitative behaviors of trained agents.
