Incorporating Annotator Uncertainty into Representations of Discourse Relations
This work addresses annotation challenges relevant to discourse analysis researchers, but its contribution is incremental because it builds on existing methods for handling uncertainty.
The paper tackles the difficulty of annotating discourse relations by analyzing novice annotators' uncertainty in spoken conversational data. It finds that dialogue context predicts confidence scores, and that incorporating confidence and dialogue context into discourse relation representations coherently models annotator uncertainty.
Annotation of discourse relations is known to be a difficult task, especially for non-expert annotators. In this paper, we investigate novice annotators' uncertainty when annotating discourse relations in spoken conversational data. We find that dialogue context (single turn, pair of turns within speaker, and pair of turns across speakers) is a significant predictor of confidence scores. We compute distributed representations of discourse relations from co-occurrence statistics that incorporate information about confidence scores and dialogue context. We perform a hierarchical clustering analysis using these representations and show that weighting discourse relation representations with information about confidence and dialogue context coherently models our annotators' uncertainty about discourse relation labels.
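
To make the pipeline described above concrete, the following is a minimal sketch of confidence-weighted co-occurrence representations followed by agglomerative clustering. It is not the paper's exact procedure. The toy annotations, the relation labels, the choice to weight each co-occurrence by the product of the two confidence scores, the inclusion of a label's co-occurrence with itself, the use of (label, dialogue context) pairs as vector dimensions, and the cosine distance with average linkage are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy data: (item_id, dialogue_context, relation_label, confidence in [0, 1]).
# Each item is assumed to belong to exactly one dialogue context.
ANNOTATIONS = [
    ("u1", "single_turn", "Elaboration", 0.9),
    ("u1", "single_turn", "Explanation", 0.8),
    ("u2", "within_speaker", "Elaboration", 0.7),
    ("u2", "within_speaker", "Explanation", 0.6),
    ("u3", "across_speakers", "Contrast", 0.9),
    ("u3", "across_speakers", "Correction", 0.8),
    ("u4", "across_speakers", "Contrast", 0.5),
    ("u4", "across_speakers", "Correction", 0.4),
]


def build_vectors(annotations):
    """Return {label: vector}; each dimension is a (co-occurring label, context) pair."""
    by_item = defaultdict(list)
    for item, context, label, conf in annotations:
        by_item[item].append((context, label, conf))
    labels = sorted({a[2] for a in annotations})
    contexts = sorted({a[1] for a in annotations})
    index = {(l, c): i for i, (l, c) in
             enumerate((l, c) for l in labels for c in contexts)}
    vectors = {l: [0.0] * len(index) for l in labels}
    for rows in by_item.values():
        # Assumption: labels given to the same item co-occur (a label also
        # co-occurs with itself), weighted by the product of confidences.
        for ctx, a, conf_a in rows:
            for _, b, conf_b in rows:
                vectors[a][index[(b, ctx)]] += conf_a * conf_b
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups."""
    clusters = [[label] for label in sorted(vectors)]

    def linkage(c1, c2):
        total = sum(cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters; i < j, so deleting j is safe.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return clusters


if __name__ == "__main__":
    print(agglomerate(build_vectors(ANNOTATIONS), n_clusters=2))
```

On this toy data, labels that annotators assign to the same items end up in the same cluster. Low-confidence annotations contribute less to each vector, which is one simple way to let uncertainty shape the representation. A real analysis would likely use a full dendrogram rather than a fixed number of clusters.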