GR / AI / RO, Aug 28, 2021

DASH: Modularized Human Manipulation Simulation with Vision and Language for Embodied AI

arXiv:2108.12536v1, 1 citation
Originality: Incremental advance
AI Analysis

This work offers an integrated simulation platform for scientific and engineering applications. The advance is incremental, since it builds on existing modular design and simulation techniques.

The authors tackle the problem of creating a virtual human for embodied AI that performs grasp-and-stack tasks in cluttered environments from natural language commands. They report a high success rate, with fluid and diverse motions under anthropomorphic constraints.

Creating virtual humans with embodied, human-like perceptual and actuation constraints has the promise to provide an integrated simulation platform for many scientific and engineering applications. We present Dynamic and Autonomous Simulated Human (DASH), an embodied virtual human that, given natural language commands, performs grasp-and-stack tasks in a physically-simulated cluttered environment solely using its own visual perception, proprioception, and touch, without requiring human motion data. By factoring the DASH system into a vision module, a language module, and manipulation modules of two skill categories, we can mix and match analytical and machine learning techniques for different modules so that DASH is able to not only perform randomly arranged tasks with a high success rate, but also do so under anthropomorphic constraints and with fluid and diverse motions. The modular design also favors analysis and extensibility to more complex manipulation skills.
