LG · AI · Aug 5, 2022

Model Blending for Text Classification

arXiv:2208.02819v1 · 1 citation · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses efficiency issues for deploying models in resource-constrained environments, but it is incremental as it applies existing distillation techniques to a specific domain.

The paper tackles the compute and memory cost of deep neural networks: it reduces the complexity of state-of-the-art LSTM models for text classification by distilling their knowledge into CNN-based models, which lowers inference latency at test time.

Deep neural networks (DNNs) have proven successful in a wide variety of applications such as speech recognition and synthesis, computer vision, machine translation, and game playing, to name but a few. However, existing deep neural network models are computationally expensive and memory-intensive, which hinders their deployment on devices with limited memory or in applications with strict latency requirements. A natural idea, therefore, is to compress and accelerate deep networks without significantly degrading model performance, which is what we call reducing the complexity. In this work, we try to reduce the complexity of state-of-the-art LSTM models for natural language tasks such as text classification by distilling their knowledge into CNN-based models, thus reducing the inference time (or latency) during testing.
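The distillation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss the authors use, so the example below assumes the common soft-target formulation: a KL-divergence term between temperature-softened teacher (LSTM) and student (CNN) outputs, mixed with cross-entropy on the true label. The function names and the `temperature` and `alpha` parameters are hypothetical choices for illustration, not taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed soft-target distillation loss for a single example.

    student_logits: class scores from the student (e.g. a CNN)
    teacher_logits: class scores from the teacher (e.g. an LSTM)
    label:          index of the true class
    temperature:    softening factor (hypothetical default)
    alpha:          weight of the soft-target term (hypothetical default)
    """
    log_p_teacher = log_softmax(teacher_logits, temperature)
    log_p_student = log_softmax(student_logits, temperature)

    # KL(teacher || student) on the softened distributions.
    soft_term = sum(
        math.exp(lt) * (lt - ls)
        for lt, ls in zip(log_p_teacher, log_p_student)
    )

    # Ordinary cross-entropy of the student against the true label.
    hard_term = -log_softmax(student_logits)[label]

    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * soft_term + (1.0 - alpha) * hard_term


if __name__ == "__main__":
    teacher = [0.2, 0.1, 3.0]   # teacher favours class 2
    student = [0.5, 0.3, 1.0]   # student is less certain
    print(distillation_loss(student, teacher, label=2))
```

In a real training loop the same quantity would be averaged over a batch and minimized with respect to the student's parameters only, with the teacher's outputs held fixed.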

