CLOct 31, 2022

Improving Temporal Generalization of Pre-trained Language Models with Lexical Semantic Change

arXiv:2210.17127v1299 citationsh-index: 44Has Code
Originality Incremental advance
AI Analysis

This work addresses the temporal misalignment issue in language models for NLP applications, offering an incremental improvement over current adaptation methods.

The paper tackles the problem of poor temporal generalization in pre-trained language models by linking it to lexical semantic change and proposes a lexical-level masking strategy for post-training. The method outperforms existing continual training approaches on two models and four datasets across two classification tasks.

Recent research has revealed that neural language models at scale suffer from poor temporal generalization capability, i.e., the language model pre-trained on static data from past years performs worse over time on emerging data. Existing methods mainly perform continual training to mitigate such a misalignment. While effective to some extent but is far from being addressed on both the language modeling and downstream tasks. In this paper, we empirically observe that temporal generalization is closely affiliated with lexical semantic change, which is one of the essential phenomena of natural languages. Based on this observation, we propose a simple yet effective lexical-level masking strategy to post-train a converged language model. Experiments on two pre-trained language models, two different classification tasks, and four benchmark datasets demonstrate the effectiveness of our proposed method over existing temporal adaptation methods, i.e., continual training with new data. Our code is available at \url{https://github.com/zhaochen0110/LMLM}.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes