Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts
This work addresses term mismatch and asymmetric text lengths in query-ticket pairs for ticketing systems; it is an incremental improvement with quantified gains over existing baselines.
The paper tackled the problem of retrieving relevant solutions from historical tickets in an industrial ticketing system by learning similarity in asymmetric text pairs, achieving a 22% gain over unsupervised baselines and a 7% gain over supervised baselines in Accuracy@10 for retrieval.
The goal of our industrial ticketing system is to retrieve a relevant solution for an input query by matching it against historical tickets stored in a knowledge base. A query comprises a subject and a description, while a historical ticket consists of a subject, a description, and a solution. To retrieve a relevant solution, we use a textual similarity paradigm to learn similarity between the query and historical tickets. The task is challenging because of significant term mismatch in query-ticket pairs of asymmetric lengths: the subject is a short text, whereas the description and solution are multi-sentence texts. We present a novel Replicated Siamese LSTM model to learn similarity in asymmetric text pairs, which yields gains of 22% and 7% in Accuracy@10 on the retrieval task over unsupervised and supervised baselines, respectively. We also show that topic and distributed semantic features for short and long texts improve both similarity learning and retrieval.
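To make the retrieval setup and the Accuracy@10 metric concrete, the following is a minimal sketch and not the paper's implementation. It assumes that a query counts as a hit when at least one relevant historical ticket appears among the top k ranked tickets; the abstract does not spell out this definition. The names rank_tickets, accuracy_at_k, and token_overlap are hypothetical, and token_overlap is a simple Jaccard score that only stands in for the learned Replicated Siamese LSTM similarity.

```python
from typing import Callable, Dict, List, Set


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase tokens.

    Placeholder only: the paper learns similarity with a Replicated
    Siamese LSTM; this function is an assumption for illustration.
    """
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def rank_tickets(query: str,
                 tickets: Dict[str, str],
                 similarity: Callable[[str, str], float]) -> List[str]:
    """Return ticket ids sorted by descending similarity to the query."""
    return sorted(tickets,
                  key=lambda tid: similarity(query, tickets[tid]),
                  reverse=True)


def accuracy_at_k(rankings: List[List[str]],
                  relevant: List[Set[str]],
                  k: int = 10) -> float:
    """Fraction of queries with at least one relevant ticket in the top k.

    Assumed definition of Accuracy@k; rankings[i] is the ranked ticket
    list for query i and relevant[i] is its set of relevant ticket ids.
    """
    if not rankings:
        return 0.0
    hits = sum(1 for ranked, rel in zip(rankings, relevant)
               if rel.intersection(ranked[:k]))
    return hits / len(rankings)


if __name__ == "__main__":
    # Toy knowledge base; the ticket texts are invented for illustration.
    tickets = {
        "t1": "printer driver not installed",
        "t2": "vpn connection timeout",
        "t3": "password reset request",
    }
    ranked = rank_tickets("vpn timeout on connection", tickets, token_overlap)
    print(ranked)
    print(accuracy_at_k([ranked], [{"t2"}], k=1))
```

Swapping token_overlap for a learned scoring function leaves the ranking and evaluation code unchanged, which is why the similarity is passed in as a parameter.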