CL, Jan 29, 2018

A Corpus for Modeling Word Importance in Spoken Dialogue Transcripts

arXiv:1801.09746v31089 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for better metrics to evaluate automatic speech recognition (ASR) performance for deaf or hard-of-hearing users, though it is incremental as it builds on existing datasets and methods.

The authors model word importance in spoken dialogue transcripts by augmenting the Switchboard corpus with word-importance annotations. Their best model achieves an F-score of 0.60 on an ordinal 6-class classification task and a concordance correlation coefficient of 0.839 with the human annotators (agreement between annotators is 0.89).

Motivated by a project to create a system for people who are deaf or hard-of-hearing (DHH) that would use automatic speech recognition (ASR) to produce real-time text captions of spoken English during in-person meetings with hearing individuals, we have augmented a transcript of the Switchboard conversational dialogue corpus with an overlay of word-importance annotations, with a numeric score for each word, to indicate its importance to the meaning of each dialogue turn. Further, we demonstrate the utility of this corpus by training an automatic word importance labeling model; our best performing model has an F-score of 0.60 in an ordinal 6-class word-importance classification task with an agreement (concordance correlation coefficient) of 0.839 with the human annotators (agreement score between annotators is 0.89). Finally, we discuss our intended future applications of this resource, particularly for the task of evaluating ASR performance, i.e., creating metrics that predict ASR-output caption text usability for DHH users better than Word Error Rate (WER).
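
The agreement figures above are concordance correlation coefficients. As a rough illustration only, the sketch below computes Lin's concordance correlation coefficient between two lists of per-word scores, using population (1/n) variance and covariance. The function name `concordance_ccc` and the sample scores are hypothetical, and the abstract does not say how the authors computed their values, so this should not be read as their implementation.

```python
from statistics import fmean


def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between two score lists.

    Illustrative helper (name and interface are assumptions, not from the paper).
    Uses population (1/n) variance and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both sequences are the same constant")
    return 2 * cov_xy / denom


# Made-up importance scores for five words: one "model" and one "annotator".
model_scores = [0.1, 0.7, 0.3, 0.9, 0.5]
human_scores = [0.2, 0.6, 0.3, 1.0, 0.4]
print(round(concordance_ccc(model_scores, human_scores), 3))
```

Unlike Pearson correlation, this coefficient penalizes a systematic offset between the two raters: scores that are perfectly correlated but shifted by a constant yield a value below 1.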

