IR · Aug 23, 2021

Sequence-to-Sequence Learning on Keywords for Efficient FAQ Retrieval

Sourav Dutta, Haytham Assem, Edward Burgin

arXiv:2108.10019v1 · 3.62 citations

Originality: Incremental advance

AI Analysis

This work addresses efficient FAQ retrieval for enterprise chatbots and customer support, though it appears incremental as it builds on existing keyword and Seq2Seq techniques.

The paper tackles the lexical and semantic gap between user queries and answers in FAQ retrieval for chatbots and support systems. It proposes TI-S2S, a framework that trains a Seq2Seq model on TF-IDF keywords and Word2Vec embeddings, and reports around 92% precision-at-rank-5, a nearly 13% improvement over existing approaches.

Frequently-Asked-Question (FAQ) retrieval provides an effective procedure for responding to users' natural-language queries. Such platforms are becoming common in enterprise chatbots, product question answering, and preliminary technical support for customers. However, the challenge in such scenarios lies in bridging the lexical and semantic gap between varied query formulations and the corresponding answers, both of which are typically very short. This paper proposes TI-S2S, a novel learning framework that combines TF-IDF based keyword extraction and Word2Vec embeddings to train a Sequence-to-Sequence (Seq2Seq) architecture. It achieves high precision for FAQ retrieval by better capturing the underlying intent of a user question through its representative keywords. We further propose a variant with an additional neural network module that guides retrieval by identifying relevant candidates based on similarity features. Experiments on a publicly available dataset show that our approaches achieve around 92% precision-at-rank-5, a nearly 13% improvement over existing approaches.
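
To make the keyword-extraction step concrete, the following is a minimal sketch of picking the top TF-IDF terms from each FAQ question. It is not the authors' implementation: the function names `tokenize` and `tfidf_keywords`, the smoothed IDF formula, the alphabetical tie-breaking, and the example questions are all assumptions chosen for illustration. The Word2Vec embedding and Seq2Seq training stages are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_keywords(questions, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms for each question.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, is used here;
    the paper's abstract does not state which TF-IDF variant it uses.
    """
    docs = [tokenize(q) for q in questions]
    n_docs = len(docs)
    # Document frequency: number of questions in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))

    keywords = []
    for doc in docs:
        tf = Counter(doc)
        scores = {
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Highest score first; ties are broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords.append(ranked[:top_k])
    return keywords


# Hypothetical FAQ questions, used only to exercise the function.
faqs = [
    "How do I reset my password?",
    "How do I change my email address?",
    "How do I delete my account?",
]

for question, kws in zip(faqs, tfidf_keywords(faqs, top_k=2)):
    print(question, "->", kws)
```

Terms shared by every question ("how", "do", "i", "my") receive the lowest IDF and drop out, leaving the intent-bearing words as keywords. In a pipeline like the one the abstract describes, such keywords would then be embedded and fed to the Seq2Seq model.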
