Extreme Classification for Answer Type Prediction in Question Answering
This paper addresses semantic answer type prediction for question answering systems; the contribution is incremental, as it builds on existing XBERT methods.
The paper tackles the challenge of predicting knowledge graph types for natural language questions by improving the clustering stage in an extreme multi-label classification pipeline, achieving state-of-the-art results.
Semantic answer type prediction (SMART) is known to be a useful step towards effective question answering (QA) systems. The SMART task involves predicting the top-$k$ knowledge graph (KG) types for a given natural language question. This is challenging due to the large number of types in KGs. In this paper, we propose the use of extreme multi-label classification with Transformer models (XBERT), clustering KG types using structural and semantic features based on question text. Specifically, we improve the clustering stage of the XBERT pipeline using textual and structural features derived from KGs. We show that these features improve end-to-end performance on the SMART task and yield state-of-the-art results.
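To make the clustering idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy type hierarchy (`PARENT`) and hypothetical training questions per type (`QUESTIONS`), builds one sparse vector per KG type from question tokens (textual features) and the parent type (a structural feature), and groups the types with a plain spherical k-means that stands in for the clustering stage of the XBERT pipeline. All names, data, and the choice of k-means are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data (assumption): each KG type has one parent in the
# type hierarchy and a few training questions whose answer has that type.
PARENT = {
    "dbo:City": "dbo:Place",
    "dbo:Country": "dbo:Place",
    "dbo:Scientist": "dbo:Person",
    "dbo:Politician": "dbo:Person",
}
QUESTIONS = {
    "dbo:City": ["where is the capital located"],
    "dbo:Country": ["where is the river located"],
    "dbo:Scientist": ["who discovered penicillin"],
    "dbo:Politician": ["who was elected president"],
}


def normalize(vec):
    """Scale a sparse vector (dict) to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)


def type_features(kg_type):
    """Combine textual (question tokens) and structural (parent) features."""
    feats = Counter()
    for question in QUESTIONS[kg_type]:
        for token in question.split():
            feats["txt:" + token] += 1.0
    feats["anc:" + PARENT[kg_type]] += 1.0
    return normalize(feats)


def cosine(a, b):
    """Dot product of two unit-norm sparse vectors."""
    return sum(w * b.get(k, 0.0) for k, w in a.items())


def cluster_types(types, seeds, iters=5):
    """Spherical k-means over type vectors, seeded with the given types."""
    feats = {t: type_features(t) for t in types}
    centroids = [feats[s] for s in seeds]
    assignment = {}
    for _ in range(iters):
        # Assign each type to its most similar centroid.
        assignment = {
            t: max(range(len(centroids)),
                   key=lambda c: cosine(feats[t], centroids[c]))
            for t in types
        }
        # Recompute each centroid as the normalized sum of its members.
        for c in range(len(centroids)):
            total = Counter()
            for t in types:
                if assignment[t] == c:
                    total.update(feats[t])  # adds weights key by key
            if total:
                centroids[c] = normalize(total)
    return assignment


assignment = cluster_types(list(PARENT), seeds=["dbo:City", "dbo:Scientist"])
print(assignment)
```

In this toy setting, types that share a parent and similar question wording end up in the same cluster. In an extreme multi-label pipeline, such clusters would let a model first choose a small group of candidate types and then rank types within it, rather than scoring every KG type for each question.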