CL AI IR · Nov 26, 2019

SemEval-2015 Task 3: Answer Selection in Community Question Answering

Preslav Nakov, Lluís Màrquez, Walid Magdy, Alessandro Moschitti, James Glass, Bilal Randeree

arXiv:1911.11403v1 · 31.51137 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses the need for standardized benchmarks in community question answering research, though its contribution is incremental in that it builds on existing QA frameworks.

The paper tackles answer selection in community question answering by organizing SemEval-2015 Task 3, which comprised two subtasks: classifying answer quality in English and Arabic, and answering YES/NO questions in English. The best systems achieved macro-averaged F1 scores of 57.19 and 63.7 on English subtasks A and B, respectively, and 78.55 on Arabic subtask A.

Community Question Answering (cQA) provides new interesting research directions to the traditional Question Answering (QA) field, e.g., the exploitation of the interaction between users and the structure of related posts. In this context, we organized SemEval-2015 Task 3 on "Answer Selection in cQA", which included two subtasks: (a) classifying answers as "good", "bad", or "potentially relevant" with respect to the question, and (b) answering a YES/NO question with "yes", "no", or "unsure", based on the list of all answers. We set subtask A for Arabic and English on two relatively different cQA domains, i.e., the Qatar Living website for English, and a Quran-related website for Arabic. We used crowdsourcing on Amazon Mechanical Turk to label a large English training dataset, which we released to the research community. Thirteen teams participated in the challenge with a total of 61 submissions: 24 primary and 37 contrastive. The best systems achieved an official score (macro-averaged F1) of 57.19 and 63.7 for the English subtasks A and B, and 78.55 for the Arabic subtask A.
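The official score mentioned above is macro-averaged F1, that is, the unweighted mean of the per-class F1 scores. The following is a minimal sketch of that metric for a three-way labeling like subtask A. The function name `macro_f1`, the short label strings, and the toy gold/predicted lists are hypothetical, and the task's official scorer may differ in details that this abstract does not specify.

```python
from collections import Counter


def macro_f1(gold, pred, labels):
    """Return the unweighted mean of per-class F1 over `labels`."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct prediction for class g
        else:
            fp[p] += 1      # p was predicted but is wrong
            fn[g] += 1      # g was the true class but was missed

    f1_scores = []
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1_scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical toy data, not taken from the task's datasets.
LABELS = ["good", "bad", "potential"]
gold = ["good", "good", "bad", "potential"]
pred = ["good", "bad", "bad", "good"]
score = macro_f1(gold, pred, LABELS)
```

Because every class contributes equally to the average, a system that never predicts a rare class (here, "potential") is penalized with a zero F1 for that class regardless of how few examples it has.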
