CL · Jul 20, 2017

Large-Scale Goodness Polarity Lexicons for Community Question Answering

Todor Mihaylov, Daniel Belchev, Yasen Kiprov, Ivan Koychev, Preslav Nakov

arXiv:1707.06378v1 · 0.78 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of identifying high-quality answers in community forums. The advance is incremental, since it adapts an existing idea from sentiment analysis to a new domain.

The authors tackle the problem of ranking comments in community question answering by building a goodness polarity lexicon, analogous to the polarity lexicons used in sentiment analysis. The lexicon improves ranking by 0.7 MAP points absolute over a very strong baseline and achieves state-of-the-art performance on SemEval-2016 Task 3.

We transfer a key idea from the field of sentiment analysis to a new domain: community question answering (cQA). The cQA task we are interested in is the following: given a question and a thread of comments, we want to re-rank the comments so that the ones that are good answers to the question are ranked higher than the bad ones. We notice that good vs. bad comments use specific vocabulary and that one can often predict the goodness/badness of a comment even ignoring the question, based on the comment contents only. This leads us to the idea of building a good/bad polarity lexicon by analogy with the positive/negative sentiment polarity lexicons commonly used in sentiment analysis. In particular, we use pointwise mutual information to build large-scale goodness polarity lexicons in a semi-supervised manner, starting with a small number of initial seeds. The evaluation results show an improvement of 0.7 MAP points absolute over a very strong baseline and state-of-the-art performance on SemEval-2016 Task 3.
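
The abstract does not give the scoring formula, so the following is only a minimal sketch of how a seed-based PMI lexicon might be built, assuming the convention common in sentiment lexicon work: a word's score is PMI(word, good seeds) minus PMI(word, bad seeds), with co-occurrence counted at the comment level. The function name `build_goodness_lexicon`, the `smoothing` and `min_count` parameters, and the seed words in the usage note are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def build_goodness_lexicon(comments, good_seeds, bad_seeds,
                           smoothing=1.0, min_count=1):
    """Score each word by PMI(word, good seeds) - PMI(word, bad seeds).

    comments: iterable of token lists (unlabeled comments).
    A comment counts as a "good" context if it contains any good seed,
    and as a "bad" context if it contains any bad seed (assumption).
    """
    good_seeds, bad_seeds = set(good_seeds), set(bad_seeds)
    co_good, co_bad, doc_freq = Counter(), Counter(), Counter()
    n_good = n_bad = 0

    for tokens in comments:
        vocab = set(tokens)  # count each word once per comment
        has_good = bool(vocab & good_seeds)
        has_bad = bool(vocab & bad_seeds)
        n_good += has_good
        n_bad += has_bad
        for word in vocab:
            doc_freq[word] += 1
            if has_good:
                co_good[word] += 1
            if has_bad:
                co_bad[word] += 1

    lexicon = {}
    for word, freq in doc_freq.items():
        if freq < min_count:
            continue
        # PMI(w, good) - PMI(w, bad) simplifies to
        # log2( c(w, good) * n_bad / (c(w, bad) * n_good) );
        # additive smoothing avoids log(0) and division by zero.
        numerator = (co_good[word] + smoothing) * (n_bad + smoothing)
        denominator = (co_bad[word] + smoothing) * (n_good + smoothing)
        lexicon[word] = math.log2(numerator / denominator)
    return lexicon
```

Under these assumptions, calling `build_goodness_lexicon(comments, {"try"}, {"thanks"})` on tokenized comments returns a dictionary in which words that mostly co-occur with the good seeds get positive scores and words that mostly co-occur with the bad seeds get negative ones. Such scores could then be aggregated per comment and used as features for re-ranking.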
