CL AI IR LG
Dec 2, 2019

SemEval-2017 Task 3: Community Question Answering

Preslav Nakov, Doris Hoogeveen, Lluís Màrquez, Alessandro Moschitti, Hamdy Mubarak, Timothy Baldwin, Karin Verspoor

arXiv:1912.00730v131.61140 citations
Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses benchmarking and evaluation in community question answering. It is incremental, since it reruns the four subtasks from SemEval-2016 and adds one new subtask.

The paper describes SemEval-2017 Task 3 on Community Question Answering, which reran the four subtasks from 2016 and added a new subtask E for multi-domain question duplicate detection. A total of 23 teams participated in subtasks A-D, and none participated in subtask E. The best systems achieved MAP scores of 88.43, 47.22, 15.46, and 61.16 in subtasks A, B, C, and D, respectively, all above the baselines.

We describe SemEval-2017 Task 3 on Community Question Answering. This year, we reran the four subtasks from SemEval-2016: (A) Question-Comment Similarity, (B) Question-Question Similarity, (C) Question-External Comment Similarity, and (D) Rerank the correct answers for a new question in Arabic, providing all the data from 2015 and 2016 for training, and fresh data for testing. Additionally, we added a new subtask E in order to enable experimentation with Multi-domain Question Duplicate Detection in a larger-scale scenario, using StackExchange subforums. A total of 23 teams participated in the task, and submitted a total of 85 runs (36 primary and 49 contrastive) for subtasks A-D. Unfortunately, no teams participated in subtask E. A variety of approaches and features were used by the participating systems to address the different subtasks. The best systems achieved an official score (MAP) of 88.43, 47.22, 15.46, and 61.16 in subtasks A, B, C, and D, respectively. These scores are better than the baselines, especially for subtasks A-C.
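
The official scores above are Mean Average Precision (MAP) values. As a rough illustration of how such a score is computed, here is a minimal sketch using the textbook definition of MAP. It is not the official SemEval scorer, which may differ in details such as rank cutoffs or the handling of questions with no relevant items. The function names and the input layout (one list of relevance labels per question, in ranked order) are assumptions made for this example.

```python
from typing import Dict, List


def average_precision(ranked_labels: List[bool]) -> float:
    """Average precision for one question.

    ranked_labels holds the relevance label of each candidate,
    ordered from the system's top-ranked candidate downwards.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            hits += 1
            # Precision at this rank, counted only at relevant positions.
            precision_sum += hits / rank
    # Assumption: a question with no relevant candidate scores 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings: Dict[str, List[bool]]) -> float:
    """Mean of the per-question average precision values."""
    if not rankings:
        return 0.0
    total = sum(average_precision(labels) for labels in rankings.values())
    return total / len(rankings)


# Hypothetical toy data: two questions with ranked candidate labels.
toy_rankings = {
    "q1": [True, False, True],
    "q2": [False, True],
}
toy_map = mean_average_precision(toy_rankings)
print(f"MAP = {100 * toy_map:.2f}")
```

Multiplying by 100 puts the result on the same percentage scale as the scores quoted in the abstract.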
