CL AI

Apr 29, 2020

SubjQA: A Dataset for Subjectivity and Review Comprehension

Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein

arXiv:2004.14283v331.21005 citations

Has Code

Originality: Synthesis-oriented

AI Analysis

The paper addresses the lack of research on subjectivity in QA over user-generated data, but its contribution is incremental because it builds on existing work on subjectivity in NLP.

The paper tackles subjectivity in question answering (QA) by introducing SubjQA, a dataset built from customer reviews with subjectivity annotations across 6 domains. It finds that subjectivity is an important feature in QA and that it interacts with QA in intricate ways: for example, a subjective question is not always linked to a subjective answer.

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to be important for sentiment analysis and word-sense disambiguation. Furthermore, subjectivity is an important aspect of user-generated data. In spite of this, subjectivity has not been investigated in contexts where such data is widespread, such as in question answering (QA). We therefore investigate the relationship between subjectivity and QA, while developing a new dataset. We compare and contrast with analyses from previous work, and verify that findings regarding subjectivity still hold when using recently developed NLP architectures. We find that subjectivity is also an important feature in the case of QA, albeit with more intricate interactions between subjectivity and QA performance. For instance, a subjective question may or may not be associated with a subjective answer. We release an English QA dataset (SubjQA) based on customer reviews, containing subjectivity annotations for questions and answer spans across 6 distinct domains.
