Grounded Language Understanding for Manipulation Instructions Using GAN-Based Classification
This work addresses a critical need for communicative domestic service robots by improving manipulation instruction understanding, though it appears incremental as it builds on existing GAN and dataset frameworks.
The study tackled the problem of grounded language understanding for domestic service robots, specifically estimating appropriate objects from short sentences with missing verbs, and demonstrated that the proposed GAN-based classifier outperformed baseline methods in quantitative evaluations.
The target task of this study is grounded language understanding for domestic service robots (DSRs). In particular, we focus on understanding short instructions in which the verb is missing. This task is critical for building communicative DSRs because manipulation is an essential DSR capability. Existing instruction understanding methods usually estimate missing information from non-grounded knowledge alone; therefore, it is unclear whether the predicted action is physically executable. In this paper, we present a grounded instruction understanding method that estimates appropriate objects given an instruction and a situation. We extend Generative Adversarial Nets (GAN) and build a GAN-based classifier that uses latent representations. To evaluate the proposed method quantitatively, we developed a data set based on the standard data set used for Visual QA. Experimental results show that the proposed method outperforms the baseline methods.
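The abstract does not describe the architecture, so the following is only an illustrative sketch of one common way to turn a GAN discriminator into a classifier over latent feature vectors: the discriminator gets a source head (real vs. generated) and a class head (which candidate class applies). Every name, dimension, and loss term here (`Discriminator`, `LATENT_DIM`, `N_CLASSES`, `discriminator_loss`) is an assumption made for illustration and is not taken from the paper. The weights are random and untrained, and the generator is replaced by a random vector.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, x):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (untrained, illustration only)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


class Discriminator:
    """Hypothetical two-headed discriminator over latent vectors."""

    def __init__(self, rng, latent_dim, n_classes):
        self.src_w, self.src_b = init_layer(rng, 1, latent_dim)
        self.cls_w, self.cls_b = init_layer(rng, n_classes, latent_dim)

    def forward(self, z):
        p_real = sigmoid(linear(self.src_w, self.src_b, z)[0])  # source head
        p_class = softmax(linear(self.cls_w, self.cls_b, z))    # class head
        return p_real, p_class


def discriminator_loss(d, z, label, is_real):
    """Source cross-entropy plus class cross-entropy (assumed loss form)."""
    p_real, p_class = d.forward(z)
    source_term = -math.log(p_real if is_real else 1.0 - p_real)
    class_term = -math.log(p_class[label])
    return source_term + class_term


rng = random.Random(0)
LATENT_DIM, N_CLASSES = 8, 4  # assumed sizes

d = Discriminator(rng, LATENT_DIM, N_CLASSES)
real_z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # stands in for an extracted latent vector
fake_z = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # stands in for generator output

p_real, p_class = d.forward(real_z)
predicted = max(range(N_CLASSES), key=lambda k: p_class[k])
loss = (discriminator_loss(d, real_z, 2, True)
        + discriminator_loss(d, fake_z, 2, False))
```

At inference time only the class head would be used: the classifier's prediction is the index with the highest probability in `p_class`, while the source head exists to drive adversarial training.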