CV, AI, CL | Mar 23, 2021

PanGEA: The Panoramic Graph Environment Annotation Toolkit

arXiv:2103.12703v1728 citations | Has Code
Originality: Synthesis-oriented
AI Analysis

This toolkit targets efficient data annotation for grounded language tasks. Its contribution is incremental, since it builds on existing annotation methods.

PanGEA is a lightweight toolkit for collecting speech and text annotations in photo-realistic 3D environments, used in a 20,000-hour effort to create the Room-Across-Room dataset.

PanGEA, the Panoramic Graph Environment Annotation toolkit, is a lightweight toolkit for collecting speech and text annotations in photo-realistic 3D environments. PanGEA immerses annotators in a web-based simulation and allows them to move around easily as they speak and/or listen. It includes database and cloud storage integration, plus utilities for automatically aligning recorded speech with manual transcriptions and the virtual pose of the annotators. Out of the box, PanGEA supports two tasks -- collecting navigation instructions and navigation instruction following -- and it could be easily adapted for annotating walking tours, finding and labeling landmarks or objects, and similar tasks. We share best practices learned from using PanGEA in a 20,000-hour annotation effort to collect the Room-Across-Room dataset. We hope that our open-source annotation toolkit and insights will both expedite future data collection efforts and spur innovation on the kinds of grounded language tasks such environments can support.
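
To make the idea of aligning transcribed speech with the annotator's virtual pose more concrete, here is a minimal Python sketch. It is not PanGEA's code or API. It assumes, purely for illustration, that each transcribed word already carries a start time and that the annotator's pose was logged as a time-sorted trace. The `Pose` and `Word` types, their fields, and the function name `align_words_to_poses` are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Pose:
    # Hypothetical pose record: when it was logged and where the annotator was.
    time: float     # seconds since the recording started
    pano_id: str    # illustrative panorama (graph node) identifier
    heading: float  # radians


@dataclass
class Word:
    # Hypothetical transcript token with timestamps in seconds.
    text: str
    start: float
    end: float


def align_words_to_poses(words, poses):
    """Pair each word with the latest pose logged at or before the word's start.

    Assumes `poses` is non-empty and sorted by time. A word spoken before the
    first logged pose falls back to that first pose.
    """
    times = [p.time for p in poses]
    aligned = []
    for w in words:
        i = bisect_right(times, w.start) - 1  # index of last pose with time <= start
        aligned.append((w.text, poses[max(i, 0)]))
    return aligned


poses = [
    Pose(0.0, "pano_a", 0.0),
    Pose(2.5, "pano_b", 1.57),
    Pose(6.0, "pano_c", 3.14),
]
words = [Word("walk", 0.4, 0.8), Word("past", 2.6, 3.0), Word("sofa", 6.2, 6.7)]

aligned = align_words_to_poses(words, poses)
for text, pose in aligned:
    print(text, pose.pano_id)
```

The binary search keeps each lookup logarithmic in the length of the pose trace. A real pipeline would also need a separate step to obtain word-level timestamps from the audio, which this sketch takes as given.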

