RO CV HC LG

Jul 27, 2022

Learning to Assess Danger from Movies for Cooperative Escape Planning in Hazardous Environments

Vikram Shree, Sarah Allen, Beatriz Asfora, Jacopo Banfi, Mark Campbell

arXiv:2207.13791v14.03 citations

h-index: 15

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of training and testing robots for hazardous scenarios such as fires or earthquakes, which are difficult to replicate in real life, by leveraging entertainment media and multi-modal data.

The paper tackles the problem of enabling robots to operate in hazardous environments by creating a dataset from movies/TV shows annotated with danger ratings and keywords, and developing a multi-modal danger estimation pipeline that fuses visual and language inputs with a risk-aware planner. The result is a higher success rate in collaborative human-robot escape missions, demonstrated through extensive simulations.

There has been a plethora of work towards improving robot perception and navigation, yet its application in hazardous environments, such as during a fire or an earthquake, is still at a nascent stage. We hypothesize two key challenges here: first, it is difficult to replicate such scenarios in the real world, which is necessary for training and testing purposes. Second, current systems are not fully able to take advantage of the rich multi-modal data available in such hazardous environments. To address the first challenge, we propose to harness the enormous amount of visual content available in the form of movies and TV shows, and develop a dataset that can represent hazardous environments encountered in the real world. The data is annotated with high-level danger ratings for realistic disaster images, and corresponding keywords are provided that summarize the content of the scene. In response to the second challenge, we propose a multi-modal danger estimation pipeline for collaborative human-robot escape scenarios. Our Bayesian framework improves danger estimation by fusing information from the robot's camera sensor and language inputs from the human. Furthermore, we augment the estimation module with a risk-aware planner that helps in identifying safer paths out of the dangerous environment. Through extensive simulations, we demonstrate the advantages of our multi-modal perception framework, which translate into tangible benefits such as a higher success rate in a collaborative human-robot mission.
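To make the idea of Bayesian fusion of camera and language inputs concrete, here is a minimal sketch in Python. It is not the paper's model: it assumes a hypothetical three-level danger scale, a uniform prior, and that the camera and language observations are conditionally independent given the true danger level. The function name `fuse_danger` and all likelihood values are invented for illustration.

```python
from typing import List, Sequence

# Hypothetical discretization of danger; the paper's actual rating scale may differ.
DANGER_LEVELS = ("low", "medium", "high")


def fuse_danger(prior: Sequence[float], *likelihoods: Sequence[float]) -> List[float]:
    """Return the posterior over danger levels.

    Each entry of `likelihoods` holds P(observation | danger level) for one
    modality. Modalities are assumed conditionally independent given the
    level, so the posterior is proportional to the prior times their product.
    """
    unnormalized = list(prior)
    for likelihood in likelihoods:
        if len(likelihood) != len(unnormalized):
            raise ValueError("likelihood and prior must have the same length")
        unnormalized = [p * l for p, l in zip(unnormalized, likelihood)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("observations have zero probability under every level")
    return [u / total for u in unnormalized]


# Illustrative numbers only (assumptions, not values from the paper).
prior = [1 / 3, 1 / 3, 1 / 3]
camera_likelihood = [0.2, 0.5, 0.3]    # P(camera observation | level)
language_likelihood = [0.1, 0.3, 0.6]  # P(human's keyword | level)

camera_only = fuse_danger(prior, camera_likelihood)
fused = fuse_danger(prior, camera_likelihood, language_likelihood)

print("camera only:", DANGER_LEVELS[camera_only.index(max(camera_only))])
print("camera + language:", DANGER_LEVELS[fused.index(max(fused))])
```

With these made-up numbers, the camera alone favors a medium rating, while adding the human's language input shifts the most likely rating to high. This is the kind of correction a multi-modal estimate could pass on to a risk-aware planner.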
