Prospection: Interpretable Plans From Language By Predicting the Future
This work addresses the challenge of making robots more interpretable and useful in real-world tasks by improving their ability to reason over the implicit steps in human instructions.
The paper tackles the problem of enabling robots to interpret high-level natural language commands by converting them into sequences of intermediate goals, using a framework that incorporates prospection to predict action consequences. It demonstrates the fidelity of generated plans in simulated scenes with real, crowd-sourced commands.
High-level human instructions often correspond to behaviors with multiple implicit steps. For robots to be useful in the real world, they must be able to reason over both the motions and the intermediate goals implied by human instructions. In this work, we propose a framework for learning representations that convert a natural-language command into a sequence of intermediate goals for execution on a robot. A key feature of this framework is prospection: training an agent not just to correctly execute the prescribed command, but to predict a horizon of consequences of an action before taking it. We demonstrate the fidelity of plans generated by our framework when interpreting real, crowd-sourced natural language commands for a robot in simulated scenes.
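To make the idea of prospection concrete, the following is a minimal toy sketch, not the paper's method. It assumes a symbolic world in which a state is a set of facts, and every name in it (`COMMAND_TO_SUBGOALS`, `predict`, `prospect`, the fact strings) is hypothetical. The framework described above learns both the command-to-subgoal mapping and the prediction of consequences; here both are hand-written so that the rollout over a horizon is easy to inspect.

```python
# Hypothetical hand-written mapping from a command to its implied subgoals.
# In the framework described above this mapping is learned, not hard-coded.
COMMAND_TO_SUBGOALS = {
    "put the red block on the blue block": [
        "grasp:red", "move_over:blue", "release:red",
    ],
}


def predict(state: frozenset, subgoal: str) -> frozenset:
    """Toy forward model: predict the state once a subgoal is achieved."""
    verb, obj = subgoal.split(":")
    facts = set(state)
    if verb == "grasp":
        facts.add(f"holding:{obj}")
    elif verb == "move_over":
        facts.add(f"above:{obj}")
    elif verb == "release":
        facts.discard(f"holding:{obj}")
        # A released object lands on whatever the gripper was above.
        for fact in [f for f in facts if f.startswith("above:")]:
            facts.discard(fact)
            facts.add(f"on:{obj}:{fact.split(':')[1]}")
    return frozenset(facts)


def prospect(state: frozenset, subgoals: list, horizon: int) -> list:
    """Roll the forward model out over the next `horizon` subgoals
    without executing anything, returning (subgoal, predicted state) pairs."""
    trajectory = []
    for subgoal in subgoals[:horizon]:
        state = predict(state, subgoal)
        trajectory.append((subgoal, state))
    return trajectory


plan = COMMAND_TO_SUBGOALS["put the red block on the blue block"]
rollout = prospect(frozenset(), plan, horizon=3)
for subgoal, predicted in rollout:
    print(subgoal, "->", sorted(predicted))
```

Because each predicted state is human-readable, the rollout can be checked before the robot moves, which is the sense in which such a plan is interpretable. A learned system would replace `predict` with a trained model operating on sensor data rather than symbolic facts.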