CL · RO · Mar 13, 2022

Summarizing a virtual robot's past actions in natural language

arXiv:2203.06671v1 · 4 citations · h-index: 10
Originality: Synthesis-oriented
AI Analysis

This work aims to improve human-robot interaction by enabling robots to summarize their actions in natural language. The contribution is incremental: it repurposes an existing dataset and tests basic methods.

The paper introduces the task of generating natural language summaries of a virtual robot's actions, proposing methods that use either video frames or text representations from a planner, and provides quantitative and qualitative evaluations to establish a baseline for future research.

We propose and demonstrate the task of giving natural language summaries of the actions of a robotic agent in a virtual environment. We explain why such a task is important, what makes it difficult, and discuss how it might be addressed. To encourage others to work on this, we show how a popular existing dataset that matches robot actions with natural language descriptions designed for an instruction following task can be repurposed to serve as a training ground for robot action summarization work. We propose and test several methods of learning to generate such summaries, starting from either egocentric video frames of the robot taking actions or intermediate text representations of the actions used by an automatic planner. We provide quantitative and qualitative evaluations of our results, which can serve as a baseline for future work.
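The repurposing idea in the abstract can be sketched as a role swap: in an instruction-following dataset the natural language description is the input and the actions are the output, whereas for summarization the actions become the input and the description becomes the target. The following is a minimal sketch under stated assumptions, not the authors' pipeline. The `Episode` record, its field names, and the action strings are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Episode:
    """Hypothetical record from an instruction-following dataset."""
    description: str            # human-written natural language description
    planner_actions: List[str]  # intermediate text actions from a planner


def to_summarization_pair(ep: Episode) -> Tuple[str, str]:
    """Swap roles: planner actions become the source, the description the target."""
    source = " ; ".join(ep.planner_actions)
    return source, ep.description


# Illustrative data only; the action vocabulary here is an assumption.
episodes = [
    Episode(
        description="Put a washed apple in the fridge.",
        planner_actions=[
            "GotoLocation countertop",
            "PickupObject apple",
            "CleanObject apple",
            "PutObject apple fridge",
        ],
    ),
]

pairs = [to_summarization_pair(ep) for ep in episodes]
```

Each resulting pair could then be fed to any sequence-to-sequence model as a (source, target) training example. A video-based variant would replace the joined action string with features from egocentric frames.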

