CL, LG | May 15, 2023

MeeQA: Natural Questions in Meeting Transcripts

arXiv:2305.08502v1 | 4 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses question answering over meeting transcripts, a problem relevant to both researchers and practitioners. The contribution is incremental: it builds on existing QA methods and adds a new dataset and a new loss function.

The authors tackle natural-language question answering over meeting transcripts by introducing MeeQA, a dataset of 48K question-answer pairs drawn from 422 meetings. They also propose a novel loss function for handling questions that have no answer in the text, and report improved performance over standard QA models.

We present MeeQA, a dataset for natural-language question answering over meeting transcripts. It includes real questions asked during meetings by its participants. The dataset contains 48K question-answer pairs, extracted from 422 meeting transcripts, spanning multiple domains. Questions in transcripts pose a special challenge as they are not always clear, and considerable context may be required in order to provide an answer. Further, many questions asked during meetings are left unanswered. To improve baseline model performance on this type of question, we also propose a novel loss function, Flat Hierarchical Loss, designed to enhance performance over questions with no answer in the text. Our experiments demonstrate the advantage of using our approach over standard QA models.

Code Implementations: 1 repo
