Robot Planning and Situation Handling with Active Perception
For autonomous robots operating in dynamic environments, this framework addresses the challenge of handling execution-time failures without human intervention.
VAP-TAMP enables robots to detect and handle unforeseen situations during plan execution by using action knowledge to prompt vision-language models for active perception and scene graph reasoning. In simulation and on a mobile manipulator, it successfully handled situations such as jammed doors and fallen objects, improving task completion rates.
Current robots are capable of computing plans to accomplish complex tasks. However, real-world environments are inherently open and dynamic, and unforeseen situations frequently arise during plan execution, such as jammed doors and objects that have fallen on the floor. These situations may result from the robot's own action failures or from external disturbances, such as human activities. Detecting and handling such execution-time situations remains a significant challenge, limiting robots' ability to achieve long-term autonomy. In this paper, we develop a planning and situation-handling framework, called VAP-TAMP, that enables robots to actively perceive and address unforeseen situations during plan execution. VAP-TAMP leverages action knowledge to strategically prompt vision-language models for active view selection and situation assessment, while constructing and reasoning over scene graphs for integrated task and motion planning. We evaluated VAP-TAMP using service tasks in simulation and on a mobile manipulation platform.
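To make the idea of prompting from action knowledge concrete, the following is a minimal illustrative sketch, not the VAP-TAMP implementation. It assumes that action knowledge is available as symbolic preconditions and effects, and that a scene graph can be represented as a set of (subject, relation, object) triples. The names `Action`, `effect_questions`, and `unmet_effects` are hypothetical, and the vision-language model call itself is omitted.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Assumed scene-graph representation: (subject, relation, object) triples.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Action:
    """Hypothetical symbolic action with preconditions and expected effects."""
    name: str
    preconditions: FrozenSet[Triple]
    effects: FrozenSet[Triple]


def effect_questions(action: Action) -> List[str]:
    """Turn each expected effect into a yes/no question a VLM could be asked."""
    return [
        f"After '{action.name}', is it true that {s} {r} {o}? Answer yes or no."
        for (s, r, o) in sorted(action.effects)
    ]


def unmet_effects(action: Action, scene_graph: FrozenSet[Triple]) -> List[Triple]:
    """Return expected effects missing from the observed scene graph."""
    return sorted(action.effects - scene_graph)


# Example: the robot tried to open a door, but the door is still closed.
open_door = Action(
    name="open(door1)",
    preconditions=frozenset({("robot", "at", "door1"), ("door1", "is", "closed")}),
    effects=frozenset({("door1", "is", "open")}),
)
observed = frozenset({("robot", "at", "door1"), ("door1", "is", "closed")})

questions = effect_questions(open_door)
missing = unmet_effects(open_door, observed)
```

In this sketch, a non-empty `missing` list would flag a possible execution-time situation (here, a door that did not open), and `questions` would be the prompts sent to the vision-language model from a selected viewpoint. How views are chosen and how the scene graph is built from perception are outside what this example covers.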