CL | Oct 29, 2018

ReviewQA: a relational aspect-based opinion reading dataset

arXiv:1810.12196v1 | 3 citations
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the NLP community's need for QA benchmarks that test more nuanced reasoning. The contribution is incremental: it builds on existing QA frameworks and adds a domain-specific focus.

The authors introduced ReviewQA, a large-scale question-answering dataset with over 500,000 questions based on 100,000 hotel reviews, designed to evaluate models on relational understanding competencies rather than simple span extraction. They established baselines to benchmark model performance across different tasks.

Deep reading models for question-answering have demonstrated promising performance over the last couple of years. However, current systems tend to learn how to cleverly extract a span of the source document, based on its similarity with the question, instead of seeking the appropriate answer. Indeed, a reading machine should be able to detect relevant passages in a document regarding a question, but more importantly, it should be able to reason over the important pieces of the document in order to produce an answer when it is required. To motivate this purpose, we present ReviewQA, a question-answering dataset based on hotel reviews. The questions of this dataset are linked to a set of relational understanding competencies that we expect a model to master. Each question comes with an associated type that characterizes the required competency. With this framework, it is possible to benchmark the main families of models and to get an overview of the strengths and weaknesses of a given model on the set of tasks evaluated in this dataset. Our corpus contains more than 500,000 questions in natural language over 100,000 hotel reviews. Our setup is projective: the answer to a question does not need to be extracted from a document, as in most recent datasets, but is selected among a set of candidates that contains all the possible answers to the questions of the dataset. Finally, we present several baselines over this dataset.
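The projective setup and the per-type benchmarking described in the abstract can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the candidate list, the question-type labels, the example records, and the function names are hypothetical and are not taken from the ReviewQA release or the authors' code.

```python
from collections import defaultdict

# Hypothetical closed candidate set shared by all questions (assumption).
CANDIDATES = ["yes", "no", "1", "2", "3", "4", "5"]

def select_answer(scores):
    """Projective setup: choose the best-scoring candidate, not a text span.

    `scores` maps candidates to model scores; missing candidates count as -inf.
    """
    return max(CANDIDATES, key=lambda c: scores.get(c, float("-inf")))

def accuracy_by_type(examples, predict):
    """Accuracy per question type, to expose strengths and weaknesses per competency."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        total[ex["type"]] += 1
        if predict(ex) == ex["answer"]:
            correct[ex["type"]] += 1
    return {t: correct[t] / total[t] for t in total}

# Toy records with made-up type labels (assumption, not real dataset rows).
EXAMPLES = [
    {"type": "detection", "question": "Is the location mentioned?", "answer": "yes"},
    {"type": "detection", "question": "Is the service mentioned?", "answer": "no"},
    {"type": "rating", "question": "What is the rating of the room?", "answer": "4"},
]

if __name__ == "__main__":
    # A trivial constant baseline that always answers "yes".
    print(accuracy_by_type(EXAMPLES, lambda ex: "yes"))
    print(select_answer({"yes": 0.1, "no": 0.7}))
```

Reporting accuracy per type rather than a single aggregate score is what lets a benchmark of this kind show which competencies a given model family handles well and which it does not.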

