CL, Nov 8, 2019

An Annotation Scheme of A Large-scale Multi-party Dialogues Dataset for Discourse Parsing and Machine Comprehension

arXiv:1911.03514v1, 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of understanding complex multi-party dialogues, a concern for researchers in natural language processing. Its contribution is incremental, since it builds on existing corpus data.

The authors tackle the lack of large-scale annotated datasets for multi-party dialogues by proposing an annotation scheme based on the Ubuntu Chat Corpus. They describe the result as the first large-scale corpus for multi-party dialogue discourse parsing, and they propose machine reading comprehension over multi-party dialogues as a new task.

In this paper, we propose a scheme for annotating large-scale multi-party chat dialogues for discourse parsing and machine comprehension. The main goal of this project is to help understand multi-party dialogues. Our dataset is based on the Ubuntu Chat Corpus. For each multi-party dialogue, we annotate the discourse structure and question-answer pairs. To our knowledge, this is the first large-scale corpus for multi-party dialogue discourse parsing, and we are the first to propose the task of machine reading comprehension over multi-party dialogues.
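
To make the two annotation layers concrete, here is a minimal sketch of how one annotated dialogue might be represented. The abstract does not specify a file format or a relation inventory, so every class name, field name, relation label, and the sample dialogue below is a hypothetical illustration, not the authors' schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class DiscourseRelation:
    head: int       # index of the earlier utterance
    dependent: int  # index of the utterance that attaches to it
    label: str      # hypothetical label; the real inventory is not given here


@dataclass
class QAPair:
    question: str
    answer: str


@dataclass
class AnnotatedDialogue:
    utterances: List[Utterance]
    relations: List[DiscourseRelation] = field(default_factory=list)
    qa_pairs: List[QAPair] = field(default_factory=list)

    def replies_to(self, index: int) -> List[int]:
        """Return indices of utterances linked to the utterance at `index`."""
        return [r.dependent for r in self.relations if r.head == index]


# Invented sample in the style of a technical-support chat (not real corpus data).
example = AnnotatedDialogue(
    utterances=[
        Utterance("alice", "How do I install a package from the terminal?"),
        Utterance("bob", "Use apt: sudo apt install <name>"),
        Utterance("carol", "Which Ubuntu version are you on?"),
    ],
    relations=[
        DiscourseRelation(head=0, dependent=1, label="question-answer"),
        DiscourseRelation(head=0, dependent=2, label="clarification"),
    ],
    qa_pairs=[
        QAPair(
            question="How can a package be installed from the terminal?",
            answer="sudo apt install <name>",
        )
    ],
)

print(example.replies_to(0))
```

Because several speakers can respond to the same utterance, the discourse layer is stored as a list of edges rather than a linear chain, which is the main way a multi-party dialogue differs from a two-party one in this sketch.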
