CL, May 27, 2023

How Good is Automatic Segmentation as a Multimodal Discourse Annotation Aid?

arXiv:2305.17350v1 (135 citations)
Originality: Synthesis-oriented
AI Analysis

This work highlights a practical challenge for researchers annotating multimodal discourse: existing annotation schemes are inadequate for automatically segmented speech. The finding is an incremental step toward improving annotation efficiency.

The researchers evaluated how well automatic speech segmentation tools (Google and OpenAI's Whisper) perform as aids for annotating collaborative problem solving (CPS) in teams. They found minimal correspondence between automatically segmented speech and manually transcribed oracle utterances, and found that automatic segmentation leads to inconsistent annotations because annotators must make arbitrary judgments.

Collaborative problem solving (CPS) in teams is tightly coupled with the creation of shared meaning between participants in a situated, collaborative task. In this work, we assess the quality of different utterance segmentation techniques as an aid in annotating CPS. We (1) manually transcribe utterances in a dataset of triads collaboratively solving a problem involving dialogue and physical object manipulation, (2) annotate collaborative moves according to these gold-standard transcripts, and then (3) apply these annotations to utterances that have been automatically segmented using toolkits from Google and OpenAI's Whisper. We show that the oracle utterances have minimal correspondence to automatically segmented speech, and that automatically segmented speech using different segmentation methods is also inconsistent. We also show that annotating automatically segmented speech has distinct implications compared with annotating oracle utterances--since most annotation schemes are designed for oracle cases, when annotating automatically-segmented utterances, annotators must invoke other information to make arbitrary judgments which other annotators may not replicate. We conclude with a discussion of how future annotation specs can account for these needs.
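Step (3) above, applying gold-standard annotations to automatically segmented utterances, can be illustrated with a minimal sketch. The abstract does not say how the authors aligned the two segmentations, so the temporal-overlap rule below is an assumption, and the `Span` type, the `project_labels` function, and the label names are hypothetical. The sketch only shows why a mismatch in segment boundaries forces a judgment call: one automatic segment can inherit labels from several oracle utterances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A timed stretch of speech; labels are used only for oracle utterances."""
    start: float  # seconds
    end: float    # seconds
    labels: frozenset = frozenset()


def overlap(a: Span, b: Span) -> float:
    """Return the duration in seconds shared by two spans (0.0 if disjoint)."""
    return max(0.0, min(a.end, b.end) - max(a.start, b.start))


def project_labels(oracle, auto, min_overlap=0.0):
    """For each automatic segment, collect the labels of every oracle
    utterance that overlaps it by more than min_overlap seconds.

    Assumption: overlap in time is the alignment criterion. This is an
    illustrative choice, not the procedure described in the paper.
    """
    projected = []
    for seg in auto:
        labels = set()
        for utt in oracle:
            if overlap(seg, utt) > min_overlap:
                labels |= utt.labels
        projected.append(labels)
    return projected


# Hypothetical data: two oracle utterances, two automatic segments whose
# boundaries do not line up with the oracle boundaries.
oracle = [
    Span(0.0, 2.0, frozenset({"proposes_solution"})),
    Span(2.5, 4.0, frozenset({"asks_question"})),
]
auto = [Span(0.0, 3.0), Span(3.0, 4.0)]

result = project_labels(oracle, auto)
for seg, labels in zip(auto, result):
    print(f"{seg.start:.1f}-{seg.end:.1f}s: {sorted(labels)}")
```

In this toy case the first automatic segment spans both oracle utterances and receives both labels, while the second oracle utterance is split across two segments. An annotator working only from the automatic segments would have to decide which label applies where, which is the kind of arbitrary judgment the abstract describes.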
