SAMPLE-HD: Simultaneous Action and Motion Planning Learning Environment
SAMPLE-HD provides a new environment for robotics and AI researchers to study multi-modal understanding and manipulation, though the contribution is incremental because it builds on existing simulation tools.
The authors address the need for a simulation environment for interactive reasoning in manipulation tasks by developing SAMPLE-HD, which generates scenes with household objects, language instructions, and ground-truth paths for use as training data.
Humans exhibit remarkably high levels of multi-modal understanding: combining visual cues with read or heard knowledge comes easily to us and allows for very accurate interaction with the surrounding environment. Various simulation environments focus on providing data for tasks related to scene understanding, question answering, space exploration, and visual navigation. In this work, we provide a solution that encompasses both the visual and behavioural aspects of simulation in a new environment for learning interactive reasoning in a manipulation setup. The SAMPLE-HD environment can generate varied scenes composed of small household objects, procedurally generate language instructions for manipulation, and generate ground-truth paths that serve as training data.
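The abstract does not describe how SAMPLE-HD builds its scenes, instructions, or paths, so the following is only a minimal sketch of what template-based generation of that kind could look like. Every name here (SceneObject, generate_scene, generate_instruction, straight_line_path, the object and colour lists, the templates) is hypothetical and not taken from SAMPLE-HD, and the straight-line interpolation is a stand-in for a real motion planner.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical scene entry: a small household object on a table."""
    name: str
    color: str
    position: tuple  # (x, y) in metres, table frame (assumed convention)


# Illustrative vocabularies and templates, not the ones used by SAMPLE-HD.
OBJECT_NAMES = ["mug", "bowl", "spoon", "bottle"]
COLORS = ["red", "blue", "green", "yellow"]
TEMPLATES = [
    "Pick up the {color} {name} and place it next to the {ref_color} {ref_name}.",
    "Move the {color} {name} next to the {ref_color} {ref_name}.",
]


def generate_scene(rng, n_objects=3):
    """Sample objects with unique (name, color) pairs so references are unambiguous."""
    all_pairs = [(n, c) for n in OBJECT_NAMES for c in COLORS]
    pairs = rng.sample(all_pairs, n_objects)
    return [
        SceneObject(n, c, (round(rng.uniform(-0.5, 0.5), 2),
                           round(rng.uniform(-0.5, 0.5), 2)))
        for n, c in pairs
    ]


def generate_instruction(rng, scene):
    """Fill a random template with a target object and a distinct reference object."""
    target, reference = rng.sample(scene, 2)
    text = rng.choice(TEMPLATES).format(
        color=target.color, name=target.name,
        ref_color=reference.color, ref_name=reference.name,
    )
    return text, target, reference


def straight_line_path(start, goal, steps=5):
    """Linearly interpolate waypoints from start to goal (placeholder for a planner)."""
    return [
        (start[0] + (goal[0] - start[0]) * t / steps,
         start[1] + (goal[1] - start[1]) * t / steps)
        for t in range(steps + 1)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducible samples
    scene = generate_scene(rng)
    text, target, reference = generate_instruction(rng, scene)
    print(text)
    print(straight_line_path(target.position, reference.position))
```

Each generated sample pairs a scene, an instruction that refers to two distinct objects in it, and a waypoint list from the target to the reference, which mirrors the three kinds of output the abstract lists. A real environment would replace the interpolation with collision-aware planning in the simulator.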