CL · May 15, 2018

Continuous Learning in a Hierarchical Multiscale Neural Network

Thomas Wolf, Julien Chaumond, Clement Delangue

arXiv:1805.05758v132.01095 citations

Originality: Incremental advance

AI Analysis

This work addresses catastrophic forgetting in continuous learning for language models. It is assessed as an incremental advance within natural language processing.

The paper tackles the problem of encoding multi-scale sequence representations in language models. It proposes a hierarchical multi-scale neural network that uses continuous learning to handle both short and long time-scale dependencies, and reports improved performance on language modeling benchmarks.

We reformulate the problem of encoding a multi-scale representation of a sequence in a language model by casting it in a continuous learning framework. We propose a hierarchical multi-scale language model in which short time-scale dependencies are encoded in the hidden state of a lower-level recurrent neural network, while longer time-scale dependencies are encoded in the dynamics of the lower-level network by having a meta-learner update the weights of the lower-level neural network in an online meta-learning fashion. We use elastic weight consolidation as a higher-level mechanism to prevent catastrophic forgetting in our continuous learning framework.
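
To make the role of elastic weight consolidation concrete, the following is a minimal sketch of the standard quadratic EWC penalty and of one online weight update that includes it. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `strength` and `lr` parameters, and the use of a plain gradient step in place of the paper's learned meta-learner are all hypothetical choices made for this example.

```python
import numpy as np


def ewc_penalty(weights, anchor_weights, fisher_diag, strength=1.0):
    """Standard quadratic EWC penalty: 0.5 * strength * sum(F * (w - w_anchor)^2).

    fisher_diag is a per-weight importance estimate (diagonal Fisher information).
    """
    diff = weights - anchor_weights
    return 0.5 * strength * float(np.sum(fisher_diag * diff ** 2))


def ewc_penalty_grad(weights, anchor_weights, fisher_diag, strength=1.0):
    """Gradient of the EWC penalty with respect to the weights."""
    return strength * fisher_diag * (weights - anchor_weights)


def online_update(weights, task_grad, anchor_weights, fisher_diag,
                  lr=0.1, strength=1.0):
    """One online update of the lower-level weights.

    Assumption: a plain gradient step stands in for the meta-learner
    described in the abstract; only the EWC term is the point here.
    """
    total_grad = task_grad + ewc_penalty_grad(
        weights, anchor_weights, fisher_diag, strength
    )
    return weights - lr * total_grad


if __name__ == "__main__":
    w = np.array([1.0, 2.0])         # current lower-level weights
    w_anchor = np.array([0.0, 2.0])  # consolidated weights to stay close to
    fisher = np.array([2.0, 5.0])    # importance of each weight
    print(ewc_penalty(w, w_anchor, fisher))
    print(online_update(w, np.zeros(2), w_anchor, fisher))
```

Weights with a large Fisher value are pulled back toward their consolidated values more strongly, which is how the penalty limits forgetting while still allowing fast online adaptation of less important weights.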
