Question Type Classification Methods Comparison
This is an incremental study that addresses the problem of question classification for researchers and practitioners in natural language processing.
The paper compared state-of-the-art methods for question classification, finding that a CNN model with five convolutional layers achieved the best accuracy of 90.7% on the TREC 10 test set.
The paper presents a comparative study of state-of-the-art approaches to the question classification task: Logistic Regression, Convolutional Neural Networks (CNN), Long Short-Term Memory networks (LSTM) and Quasi-Recurrent Neural Networks (QRNN). All models use pre-trained GloVe word embeddings and are trained on human-labeled data. The best accuracy is achieved by a CNN model with five convolutional layers of various kernel sizes stacked in parallel, followed by one fully connected layer. This model reached 90.7% accuracy on the TREC 10 test set. All model architectures in this paper were implemented from scratch in PyTorch, in a few cases drawing on reliable open-source implementations.
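As an illustration of the best-performing architecture described above, here is a minimal PyTorch sketch of five parallel convolutional layers followed by one fully connected layer. It is not the authors' implementation. The class name ParallelCNNClassifier, the kernel sizes (1 to 5), the number of filters, the dropout rate, the max-over-time pooling and the six output classes (the TREC coarse labels) are all assumptions made for the example, and a random matrix stands in for the pre-trained GloVe vectors.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ParallelCNNClassifier(nn.Module):
    """Hypothetical sketch: five parallel 1-D convolutions + one linear layer."""

    def __init__(self, embeddings, num_classes=6, num_filters=100,
                 kernel_sizes=(1, 2, 3, 4, 5), dropout=0.5):
        super().__init__()
        # Frozen pre-trained word vectors (GloVe in the paper; any
        # (vocab_size, emb_dim) float tensor works for this sketch).
        self.embedding = nn.Embedding.from_pretrained(embeddings, freeze=True)
        emb_dim = embeddings.size(1)
        # One convolution per kernel size, all applied to the same input.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, k) for k in kernel_sizes]
        )
        self.max_kernel = max(kernel_sizes)
        self.dropout = nn.Dropout(dropout)  # assumed regularisation
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, token_ids):
        # (batch, seq_len) -> (batch, emb_dim, seq_len), as Conv1d expects.
        x = self.embedding(token_ids).transpose(1, 2)
        # Right-pad very short questions so the widest kernel still fits.
        if x.size(2) < self.max_kernel:
            x = F.pad(x, (0, self.max_kernel - x.size(2)))
        # Max-over-time pooling of each branch, then concatenate branches.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        features = torch.cat(pooled, dim=1)
        return self.fc(self.dropout(features))  # unnormalised class scores


# Usage with stand-in embeddings: vocabulary of 50 words, 16 dimensions.
vectors = torch.randn(50, 16)
model = ParallelCNNClassifier(vectors)
logits = model(torch.randint(0, 50, (4, 12)))  # 4 questions, 12 tokens each
```

Running the branches in parallel lets each kernel size pick up n-gram patterns of a different length, and the single linear layer then combines the pooled features from all five branches into class scores.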