LG | CL | ML | Nov 26, 2019

Semi-Supervised Learning for Text Classification by Layer Partitioning

arXiv:1911.11756v1 | 12 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of applying semi-supervised learning to text classification, especially for short texts. The contribution is incremental: it adapts existing methods rather than introducing a new paradigm.

The paper adapts semi-supervised learning methods designed for continuous inputs to discrete text inputs by partitioning a neural network into frozen and trainable components, and reports improved performance over state-of-the-art methods, particularly on short texts.

Most recent neural semi-supervised learning algorithms rely on adding small perturbations to either the input vectors or their representations. These methods have been successful on computer vision tasks because images form a continuous manifold, but they are not appropriate for discrete inputs such as sentences. To adapt these methods to text input, we propose to decompose a neural network $M$ into two components $F$ and $U$ so that $M = U\circ F$. The layers in $F$ are then frozen, and only the layers in $U$ are updated during most of the training. In this way, $F$ serves as a feature extractor that maps the input to a high-level representation and adds systematic noise using dropout. We can then train $U$ using any state-of-the-art SSL algorithm such as the $\Pi$-model, temporal ensembling, mean teacher, etc. Furthermore, this gradual unfreezing schedule also prevents a pretrained model from catastrophic forgetting. The experimental results demonstrate that our approach provides improvements over state-of-the-art methods, especially on short texts.
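
The decomposition $M = U\circ F$ can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the names `FrozenExtractor`, `TrainableHead`, and `step_on_head`, the toy architecture (embedding lookup with mean pooling for $F$, one softmax layer for $U$), the hyperparameters, and the finite-difference gradient, which stands in for backpropagation only to keep the example short. It shows only the unsupervised consistency term of a $\Pi$-model-style objective; a full semi-supervised setup would also include a supervised loss on labeled examples.

```python
import math
import random

def dropout(vec, p, rng):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    return [0.0 if rng.random() < p else v / (1.0 - p) for v in vec]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

class FrozenExtractor:
    """F (hypothetical): embedding lookup, mean pooling, then dropout noise.
    Its parameters are never updated in this sketch."""
    def __init__(self, vocab_size, dim, p_drop, rng):
        self.emb = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(vocab_size)]
        self.p_drop = p_drop
        self.rng = rng

    def __call__(self, token_ids):
        dim = len(self.emb[0])
        pooled = [sum(self.emb[t][j] for t in token_ids) / len(token_ids)
                  for j in range(dim)]
        return dropout(pooled, self.p_drop, self.rng)

class TrainableHead:
    """U (hypothetical): a single softmax layer without bias."""
    def __init__(self, dim, n_classes, rng):
        self.W = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_classes)]

    def __call__(self, features):
        logits = [sum(w * v for w, v in zip(row, features)) for row in self.W]
        return softmax(logits)

def consistency_loss(U, h1, h2):
    """Mean squared difference between U's predictions on two noisy views."""
    p1, p2 = U(h1), U(h2)
    return sum((a - b) ** 2 for a, b in zip(p1, p2)) / len(p1)

def step_on_head(U, h1, h2, lr=0.1, eps=1e-5):
    """One gradient-descent step on U only; returns the loss before the step.
    Gradients are approximated by finite differences for brevity."""
    base = consistency_loss(U, h1, h2)
    grads = []
    for row in U.W:
        g_row = []
        for j in range(len(row)):
            old = row[j]
            row[j] = old + eps
            g_row.append((consistency_loss(U, h1, h2) - base) / eps)
            row[j] = old
        grads.append(g_row)
    for row, g_row in zip(U.W, grads):
        for j, g in enumerate(g_row):
            row[j] -= lr * g
    return base

rng = random.Random(0)
F = FrozenExtractor(vocab_size=20, dim=8, p_drop=0.5, rng=rng)
U = TrainableHead(dim=8, n_classes=3, rng=rng)
emb_before = [row[:] for row in F.emb]

tokens = [3, 7, 7, 12]           # a short unlabeled "sentence" as token ids
h1, h2 = F(tokens), F(tokens)    # two dropout-perturbed views of the same input
loss = step_on_head(U, h1, h2)   # updates U.W; F.emb is left untouched
```

The perturbation is applied to the continuous features produced by `F` rather than to the discrete tokens, which is the point of the partition. Gradual unfreezing would correspond to moving layers from `F` into the trainable part as training proceeds, which this sketch does not show.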

