Read My Mind: A Multi-Modal Dataset for Human Belief Prediction
The paper addresses the need for better human-robot collaboration by providing a dataset for evaluation, but the contribution is incremental because it focuses on data creation rather than a novel method.
The authors tackled the problem of enabling AI systems to infer human beliefs for human-robot interaction by introducing a large-scale multi-modal video dataset for intent prediction based on object-context relations.
Understanding human intentions is key to enabling effective and efficient human-robot interaction (HRI) in collaborative settings. To enable the development and evaluation of the ability of artificial intelligence (AI) systems to infer human beliefs, we introduce a large-scale multi-modal video dataset for intent prediction based on object-context relations.