Semi-supervised Question Retrieval with Gated Convolutions
This work addresses the challenge of automated question retrieval in forums, which enables reuse of existing answers, though its contribution is incremental: it improves neural methods for this specific domain.
The paper tackles the problem of finding semantically related questions in forums by developing a gated convolution model that maps questions to semantic representations, achieving substantial gains over standard IR baselines and various neural network architectures.
Question answering forums are rapidly growing in size with no effective automated way to refer to and reuse answers already available for previously posted questions. In this paper, we develop a methodology for finding semantically related questions. The task is difficult since 1) key pieces of information are often buried in extraneous details in the question body and 2) available annotations on similar questions are scarce and fragmented. We design a recurrent and convolutional model (gated convolution) to effectively map questions to their semantic representations. The models are pre-trained within an encoder-decoder framework (from body to title) on the basis of the entire raw corpus, and fine-tuned discriminatively from limited annotations. Our evaluation demonstrates that our model yields substantial gains over a standard IR baseline and various neural network architectures (including CNNs, LSTMs and GRUs).
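The retrieval step implied by the abstract, mapping each question to a vector and ranking candidates by how close they are to the query, can be sketched as follows. This is a minimal illustration and not the paper's implementation: the `encode` function is a hypothetical stand-in that averages word embeddings where the paper would use its gated convolution encoder, and the use of cosine similarity as the scoring function is an assumption, since the abstract does not name one. All function names and the `embeddings` table are invented for the example.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def encode(tokens: Sequence[str], embeddings: Dict[str, Vector], dim: int) -> Vector:
    """Hypothetical stand-in encoder: average the embeddings of known tokens.

    In the paper this role is played by the gated convolution model;
    averaging is used here only to keep the sketch self-contained.
    """
    total = [0.0] * dim
    count = 0
    for tok in tokens:
        vec = embeddings.get(tok)
        if vec is None:
            continue  # skip out-of-vocabulary tokens
        for i in range(dim):
            total[i] += vec[i]
        count += 1
    if count == 0:
        return total  # all-zero vector when no token is known
    return [x / count for x in total]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(
    query: Sequence[str],
    candidates: Sequence[Sequence[str]],
    embeddings: Dict[str, Vector],
    dim: int,
) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs sorted by descending similarity."""
    q_vec = encode(query, embeddings, dim)
    scored = [
        (idx, cosine(q_vec, encode(cand, embeddings, dim)))
        for idx, cand in enumerate(candidates)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Swapping `encode` for a trained neural encoder leaves the ranking logic unchanged, which is why the quality of the learned representation is the main lever in this kind of retrieval setup.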