CL, LG · Jun 16, 2021

On the long-term learning ability of LSTM LMs

Wim Boes, Robbe Van Rompaey, Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq

arXiv:2106.08927v1 · 0.21 citations

Originality: Synthesis-oriented

AI Analysis

This work helps NLP researchers understand how language models handle long-term dependencies. It is incremental, since it builds on existing LSTM and CBOW methods.

The study investigates the long-term learning ability of LSTM language models by adding a CBOW-based contextual extension to both sentence- and discourse-level models. Sentence-level models with the extension perform comparably to vanilla discourse-level models, but the extension does not improve discourse-level models.

We inspect the long-term learning ability of Long Short-Term Memory language models (LSTM LMs) by evaluating a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and by analyzing its performance. We evaluate on text and speech. Sentence-level models using the long-term contextual module perform comparably to vanilla discourse-level LSTM LMs. On the other hand, the extension does not provide gains for discourse-level models. These findings indicate that discourse-level LSTM LMs already rely on contextual information to perform long-term learning.
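To make the idea of a CBOW-style contextual extension concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation. It assumes the context vector is the unweighted mean of the embeddings of preceding words, and that this vector is concatenated to the current word embedding before it enters the LSTM. The names `cbow_context`, `extend_input`, and the toy embedding table are hypothetical.

```python
from typing import Dict, List, Sequence


def cbow_context(history: Sequence[str],
                 embeddings: Dict[str, List[float]],
                 dim: int) -> List[float]:
    """Return the mean embedding of the known words in `history`.

    Assumption: an unweighted average, in the spirit of CBOW. The paper
    may weight or window the history differently.
    """
    total = [0.0] * dim
    count = 0
    for word in history:
        vec = embeddings.get(word)
        if vec is None:
            continue  # skip out-of-vocabulary words
        for i in range(dim):
            total[i] += vec[i]
        count += 1
    if count == 0:
        return total  # empty history yields a zero vector
    return [x / count for x in total]


def extend_input(word_vec: Sequence[float],
                 context_vec: Sequence[float]) -> List[float]:
    """Concatenate the current word embedding with the context vector.

    Assumption: concatenation is how the context reaches the LSTM input.
    """
    return list(word_vec) + list(context_vec)


# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
toy_embeddings = {"the": [1.0, 0.0], "cat": [0.0, 1.0], "sat": [1.0, 1.0]}

previous_sentence = ["the", "cat"]
context = cbow_context(previous_sentence, toy_embeddings, dim=2)
lstm_input = extend_input(toy_embeddings["sat"], context)
```

In a sentence-level model the LSTM state is reset at each sentence boundary, so a vector like `context` would be the only route by which earlier sentences influence the prediction. A discourse-level model carries its hidden state across sentences, which is consistent with the finding that the extension adds little there.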
