Interactive Text2Pickup Network for Natural Language based Human-Robot Collaboration
This work addresses the challenge of handling vague instructions in human-robot collaboration, though it appears incremental because it builds on existing methods for interaction and object estimation.
The paper tackles robotic pickup of objects specified by ambiguous natural-language commands. It proposes an Interactive Text2Pickup (IT2P) network that generates questions to resolve the ambiguity, reaching 98.49% accuracy on unambiguous commands and a 1.94-fold increase in accuracy on ambiguous commands after interaction.
In this paper, we propose the Interactive Text2Pickup (IT2P) network for human-robot collaboration, which enables effective interaction with a human user despite ambiguity in the user's commands. We focus on the task in which a robot is expected to pick up an object specified by a human and to interact with the human when the given instruction is vague. The proposed network first interprets the user's command and estimates the position of the desired object. To handle the inherent ambiguity in human language commands, it then generates a question suited to resolving that ambiguity. The user's answer is combined with the initial command and fed back to the network, resulting in a more accurate estimate. The experimental results show that, given unambiguous commands, the proposed method estimates the position of the requested object with an accuracy of 98.49% on our test dataset. Given ambiguous language commands, the accuracy of the pickup task increases by a factor of 1.94 after incorporating the information obtained from the interaction.
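
The interaction loop described in the abstract (estimate, ask, merge the answer into the command, re-estimate) can be sketched roughly as follows. This is only an illustrative sketch, not the IT2P network itself: the names `Estimate`, `interactive_pickup`, `toy_estimate`, `toy_question`, and `SCENE` are hypothetical, the toy estimator is a keyword match standing in for the learned model, and appending the answer to the command is an assumed way of "combining" the two.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Estimate:
    position: Tuple[float, float]  # estimated (x, y) of the requested object
    ambiguous: bool                # True if the command matches several objects


def interactive_pickup(
    command: str,
    estimate_fn: Callable[[str], Estimate],
    question_fn: Callable[[str, Estimate], str],
    answer_fn: Callable[[str], str],
    max_rounds: int = 1,
) -> Tuple[Estimate, str]:
    """Estimate, ask a question if ambiguous, merge the answer, re-estimate."""
    estimate = estimate_fn(command)
    rounds = 0
    while estimate.ambiguous and rounds < max_rounds:
        question = question_fn(command, estimate)
        answer = answer_fn(question)
        # Assumption: the answer is combined with the command by concatenation.
        command = f"{command} {answer}"
        estimate = estimate_fn(command)
        rounds += 1
    return estimate, command


# Hypothetical toy scene: two blocks identified only by color.
SCENE = {"red": (0.1, 0.2), "blue": (0.7, 0.4)}


def toy_estimate(command: str) -> Estimate:
    """Keyword-matching stand-in for the learned position estimator."""
    words = command.lower().split()
    matches = [pos for color, pos in SCENE.items() if color in words]
    # With no color mentioned, every object remains a candidate.
    candidates = matches or list(SCENE.values())
    x = sum(p[0] for p in candidates) / len(candidates)
    y = sum(p[1] for p in candidates) / len(candidates)
    return Estimate((x, y), ambiguous=len(candidates) > 1)


def toy_question(command: str, estimate: Estimate) -> str:
    """Stand-in for the question generator: always asks about color."""
    return "Which color is the block?"


if __name__ == "__main__":
    final, merged = interactive_pickup(
        "pick up the block", toy_estimate, toy_question, lambda q: "blue"
    )
    print(merged, final)
```

In this toy setting the initial command matches both blocks, so one question is asked and the merged command then selects a single object. In the actual system the estimator, the ambiguity measure, and the question generator would be learned components rather than keyword rules.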