Dialogue Object Search
This work addresses the challenge of developing more intelligent and collaborative robots for human-robot interaction. Its contribution is incremental: it primarily introduces a new task and discusses open challenges, without demonstrating significant advances.
The paper tackles the problem of enabling robots to search for objects while communicating with humans through dialogue. It introduces the dialogue object search task, in which a robot interacts with a remote human to locate objects in environments such as kitchens, and it presents examples from a pilot study without reporting concrete results or numbers.
We envision robots that can collaborate and communicate seamlessly with humans. Such robots must decide both what to say and how to act while interacting with humans. To this end, we introduce a new task, dialogue object search: a robot is tasked with searching for a target object (e.g., a fork) in a human environment (e.g., a kitchen) while engaging in a "video call" with a remote human who has additional but inexact knowledge about the target's location. That is, the robot conducts speech-based dialogue with the human while sharing the image from its mounted camera. This task is challenging at multiple levels, from data collection, through algorithm and system development, to evaluation. Despite these challenges, we believe this task is a roadblock that must be cleared on the path towards more intelligent and collaborative robots. In this extended abstract, we motivate and introduce the dialogue object search task and analyze examples collected from a pilot study. We then discuss our next steps and conclude with several challenges on which we hope to receive feedback.