Model Blending for Text Classification
This work addresses the efficiency of deploying models in resource-constrained environments, but its contribution is incremental: it applies existing distillation techniques to a specific domain.
The paper tackles the computational and memory cost of deep neural networks by distilling state-of-the-art LSTM models for text classification into CNN-based models, which reduces inference time (latency) at test time.
Deep neural networks (DNNs) have proven successful in a wide variety of applications, such as speech recognition and synthesis, computer vision, machine translation, and game playing, to name but a few. However, existing deep neural network models are computationally expensive and memory-intensive, which hinders their deployment on devices with limited memory or in applications with strict latency requirements. A natural approach is therefore to compress and accelerate deep networks without significantly degrading model performance, which is what we call reducing the complexity. In this work, we try to reduce the complexity of state-of-the-art LSTM models for natural language tasks such as text classification by distilling their knowledge into CNN-based models, thereby reducing the inference time (or latency) during testing.
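To make the distillation idea concrete, here is a minimal sketch of one common soft-target objective for a single training example. The text above does not specify which loss is used, so the formulation (temperature-softened KL divergence mixed with cross-entropy on the true label), the function names, and the `temperature` and `alpha` parameters are all illustrative assumptions rather than the authors' method. The teacher logits would come from the LSTM and the student logits from the CNN.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Hypothetical per-example distillation loss (assumed, not from the source).

    alpha weights the soft-target term against the hard-label term.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    soft = sum(pt * (math.log(pt) - math.log(ps))
               for pt, ps in zip(p_teacher, p_student) if pt > 0)

    # Ordinary cross-entropy of the student against the true class.
    hard = -math.log(softmax(student_logits)[true_label])

    # The temperature**2 factor keeps the soft-term gradient scale
    # comparable across temperatures, as is conventional.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Example: teacher (LSTM) and student (CNN) logits for a 3-class problem.
teacher = [2.0, 0.5, -1.0]
student = [1.2, 0.9, -0.3]
loss = distillation_loss(student, teacher, true_label=0)
```

In practice this quantity would be averaged over a batch and minimized with respect to the student's parameters only, with the teacher held fixed.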