Explicit World Models for Reliable Human-Robot Collaboration
This paper addresses the challenge of robust and interpretable AI for human-robot interaction, though the contribution appears incremental, since it builds on existing concepts of world models and alignment.
The paper tackles the problem of achieving reliable human-robot collaboration in dynamic and ambiguous environments by proposing an explicit world model to align robot behaviors with human expectations, focusing on context and interaction rather than formal verification.
This paper addresses the topic of robustness under sensing noise, ambiguous instructions, and human-robot interaction. We take a radically different tack on the issue of reliable embodied AI: instead of focusing on formal verification methods aimed at achieving model predictability and robustness, we emphasise the dynamic, ambiguous, and subjective nature of human-robot interactions, which requires embodied AI systems to perceive, interpret, and respond to human intentions in a manner that is consistent, comprehensible, and aligned with human expectations. We argue that when embodied agents operate in human environments that are inherently social, multimodal, and fluid, reliability is contextually determined and only has meaning in relation to the goals and expectations of the humans involved in the interaction. This calls for a fundamentally different approach to achieving reliable embodied AI, one centred on building and updating an accessible "explicit world model" that represents the common ground between human and AI and is used to align robot behaviours with human expectations.