CL IR LG | Aug 27, 2018

A strong baseline for question relevancy ranking

Ana V. González-Garduño, Isabelle Augenstein, Anders Søgaard

arXiv:1808.08836v132.01093 citations

Originality: Incremental advance

AI Analysis

The paper provides a strong, efficient baseline for a specific NLP task, but the advance is incremental: it builds on existing methods without a major paradigm shift.

The paper tackles question relevancy ranking in community question answering with a simple multi-task feed forward network trained on a bag of 14 distance measures for each question pair. The model outperforms the best systems from the SemEval shared tasks at retrieving relevant previously asked questions.

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks -- a task that amounts to question relevancy ranking -- involve complex pipelines and manual feature engineering. Despite this, many of these still fail at beating the IR baseline, i.e., the rankings provided by Google's search engine. We present a strong baseline for question relevancy ranking by training a simple multi-task feed forward network on a bag of 14 distance measures for the input question pair. This baseline model, which is fast to train and uses only language-independent features, outperforms the best shared task systems on the task of retrieving relevant previously asked questions.
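The approach described in the abstract, distance measures over a question pair fed into a small multi-task feed forward network, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the 14 measures, so the three below (Jaccard, cosine, and relative length difference over whitespace tokens) are stand-ins, and the head names `relevance` and `auxiliary`, the layer sizes, and the hand-picked weights are hypothetical. It is not the authors' implementation, and the network is untrained.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a deliberately simple stand-in)."""
    return text.lower().split()


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over token sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def cosine_distance(a, b):
    """1 - cosine similarity of bag-of-words count vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def length_difference(a, b):
    """Absolute token-count difference, scaled by the longer question."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return abs(len(a) - len(b)) / longest


def distance_features(q1, q2):
    """Feature vector for a question pair (3 illustrative measures, not the paper's 14)."""
    a, b = tokenize(q1), tokenize(q2)
    return [jaccard_distance(a, b), cosine_distance(a, b), length_difference(a, b)]


def forward(features, shared_w, shared_b, heads):
    """Shared ReLU hidden layer followed by one sigmoid output per task."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(shared_w, shared_b)
    ]
    outputs = {}
    for task, (w, b) in heads.items():
        z = sum(wi * h for wi, h in zip(w, hidden)) + b
        outputs[task] = 1.0 / (1.0 + math.exp(-z))
    return outputs


# Hypothetical, hand-picked weights: small distances push the relevance score up.
SHARED_W = [[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]
SHARED_B = [1.0, 0.0]
HEADS = {
    "relevance": ([2.0, -2.0], 0.0),  # main task: is the candidate question relevant?
    "auxiliary": ([1.0, 0.0], 0.0),   # placeholder for a second task sharing the hidden layer
}


def relevance_score(query, candidate):
    """Score a candidate question against the query using the main-task head."""
    feats = distance_features(query, candidate)
    return forward(feats, SHARED_W, SHARED_B, HEADS)["relevance"]
```

To rank previously asked questions for a new query under these assumptions, one would sort the candidates by `relevance_score(query, candidate)` in descending order. In a trained model the weights would be learned from labeled question pairs rather than set by hand.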
