LG · ML · Jul 17, 2019

Low-Shot Classification: A Comparison of Classical and Deep Transfer Machine Learning Approaches

arXiv:1907.07543v1 · 14 citations
Originality: Incremental advance
AI Analysis

This work addresses which machine learning paradigm NLP practitioners should choose when labeled data is limited. It provides quantitative evidence that deep transfer learning is superior in this setting. The advance is incremental because it builds on existing models such as BERT.

The paper compares deep transfer learning (BERT) with classical machine learning for low-shot text classification. BERT outperforms classical methods by 9.7% on average with 100 examples per class, narrowing to 1.8% at 1000 examples per class. It is also more robust across domains: its accuracy loss is at most 3.2%, versus up to 20.6% for classical methods.

Despite the recent success of deep transfer learning approaches in NLP, there is a lack of quantitative studies demonstrating the gains these models offer in low-shot text classification tasks over existing paradigms. Deep transfer learning approaches such as BERT and ULMFiT demonstrate that they can beat state-of-the-art results on larger datasets; however, when one has only 100 to 1000 labelled examples per class, the choice of approach is less clear, with classical machine learning and deep transfer learning both representing valid options. This paper compares the current best transfer learning approach with top classical machine learning approaches on a trinary sentiment classification task to assess the best paradigm. We find that BERT, representing the best of deep transfer learning, is the best performing approach, outperforming top classical machine learning algorithms by 9.7% on average when trained with 100 examples per class, narrowing to 1.8% at 1000 labels per class. We also show the robustness of deep transfer learning in moving across domains, where the maximum loss in accuracy is only 0.7% in similar domain tasks and 3.2% cross domain, compared to classical machine learning, which loses up to 20.6%.
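
The low-shot setting in the abstract (a fixed number of labelled examples per class) can be made concrete with a minimal sketch of class-balanced subsampling. The helper name `sample_per_class`, the toy data, and the fixed seed are illustrative assumptions; this is not the authors' code, and the paper's actual sampling procedure is not described in this summary.

```python
import random
from collections import defaultdict


def sample_per_class(examples, n_per_class, seed=0):
    """Return a class-balanced subset with n_per_class examples per label.

    `examples` is a list of (text, label) pairs. Hypothetical helper,
    written only to illustrate the low-shot setup.
    """
    rng = random.Random(seed)

    # Group the examples by their label.
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    subset = []
    for label in sorted(by_label):
        pool = by_label[label]
        if len(pool) < n_per_class:
            raise ValueError(f"label {label!r} has only {len(pool)} examples")
        # Draw without replacement so no example is repeated.
        subset.extend(rng.sample(pool, n_per_class))

    # Shuffle so the classes are interleaved in the training set.
    rng.shuffle(subset)
    return subset


# Toy three-class sentiment data (made up for this sketch).
toy = [
    (f"text {i}", label)
    for label in ("negative", "neutral", "positive")
    for i in range(1000)
]
low_shot = sample_per_class(toy, n_per_class=100)
```

Repeating the draw with `n_per_class` set to larger values up to 1000 would give the range of training-set sizes the abstract refers to, under the same assumptions.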
