Retrieving and Ranking Similar Questions from Question-Answer Archives Using Topic Modelling and Topic Distribution Regression
This work addresses vocabulary mismatch and length differences between questions and answers in QA archives, offering an incremental improvement for collaborative platforms.
The paper tackles the problem of ranking similar questions in QA platforms by adding a regression stage that aligns topics derived from questions with topics derived from question-answer pairs. The resulting model outperforms translation methods and topic modelling without regression on real-world datasets.
This paper presents a novel model for ranking similar questions within collaborative question-answer platforms. The approach integrates a regression stage that relates topics derived from questions to those derived from question-answer pairs. This helps to avoid problems caused by differences in the vocabulary used in questions and answers, and by the tendency for questions to be shorter than answers. The model is shown to outperform translation methods and topic modelling (without regression) on several real-world datasets.
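To make the regression stage concrete, the following is a minimal sketch and not the paper's implementation. It assumes that topic distributions have already been inferred separately for questions and for question-answer pairs (here they are replaced by synthetic Dirichlet samples), that the regression is a ridge-regularised least-squares map, and that ranking uses cosine similarity. None of these choices is confirmed by the abstract, and all function names (`fit_topic_regression`, `project_query`, `rank_archive`) are hypothetical.

```python
import numpy as np

def fit_topic_regression(theta_q, theta_qa, ridge=1e-3):
    """Fit W so that theta_q @ W approximates theta_qa (ridge least squares).

    Assumption: a linear map; the abstract does not specify the regressor.
    """
    k = theta_q.shape[1]
    gram = theta_q.T @ theta_q + ridge * np.eye(k)
    return np.linalg.solve(gram, theta_q.T @ theta_qa)

def project_query(theta_query, W):
    """Map a question-only topic vector into question-answer topic space."""
    mapped = np.clip(theta_query @ W, 1e-12, None)  # keep entries positive
    return mapped / mapped.sum()                    # renormalise to sum to 1

def rank_archive(theta_query, W, theta_qa):
    """Return archive indices sorted by cosine similarity, best first."""
    mapped = project_query(theta_query, W)
    norms = np.linalg.norm(theta_qa, axis=1) * np.linalg.norm(mapped)
    scores = (theta_qa @ mapped) / norms
    return np.argsort(-scores), scores

# Synthetic stand-ins for topic model output (illustrative only).
rng = np.random.default_rng(0)
theta_qa = rng.dirichlet(np.ones(4), size=50)  # 50 archived QA pairs, 4 topics
true_map = rng.dirichlet(np.ones(3), size=4)   # hypothetical QA-to-question link
theta_q = theta_qa @ true_map                  # question topics, 3 per question

W = fit_topic_regression(theta_q, theta_qa)
order, scores = rank_archive(theta_q[7], W, theta_qa)
print(order[:5])  # indices of the five highest-scoring archived questions
```

In this sketch a new query only needs a question-side topic vector: the learned map supplies an estimate of the answer-side topics it lacks, which is one way to read how the regression stage compensates for questions being shorter than answers.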