CL, May 18, 2021

DRILL: Dynamic Representations for Imbalanced Lifelong Learning

arXiv:2105.08445v2, 7 citations
Originality: Highly original
AI Analysis

This paper addresses the challenge of lifelong learning in NLP for applications with shifting data distributions, though it is incremental in that it builds on existing BERT-based methods.

The paper tackles catastrophic forgetting in continual learning for NLP by introducing DRILL, a novel architecture that uses a self-organizing neural network to gate BERT representations. The authors report that DRILL outperforms current methods in imbalanced, non-stationary data scenarios.

Continual or lifelong learning has been a long-standing challenge in machine learning, especially in natural language processing (NLP). Although state-of-the-art language models such as BERT have ushered in a new era in this field due to their outstanding performance in multitask learning scenarios, they suffer from forgetting when exposed to a continuous stream of data with shifting data distributions. In this paper, we introduce DRILL, a novel continual learning architecture for open-domain text classification. DRILL leverages a biologically inspired self-organizing neural architecture to selectively gate latent language representations from BERT in a task-incremental manner. We demonstrate in our experiments that DRILL outperforms current methods in a realistic scenario of imbalanced, non-stationary data without prior knowledge about task boundaries. To the best of our knowledge, DRILL is the first of its kind to use a self-organizing neural architecture for open-domain lifelong learning in NLP.

Code Implementations: 1 repo
