CL ML · Sep 8, 2019

Back to the Future -- Sequential Alignment of Text Representations

Johannes Bjerva, Wouter Kouw, Isabelle Augenstein

arXiv:1909.03464v3 · 0.31 citations

Originality: Incremental advance

AI Analysis

The paper addresses temporal data drift in NLP tasks such as paper acceptance prediction and rumor stance detection. Its low computational expense makes the method a practical solution, though the contribution appears incremental because it adapts alignment techniques from computer vision.

The authors tackle data drift caused by language evolution in sequential decision-making tasks by sequentially aligning learned representations. They report that the method outperforms strong baselines on three tasks that differ in time-scale, linguistic unit, and domain.

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens 'BERT' and 'ELMO' in publications refer to neural network architectures rather than persons. This type of temporal signal is typically overlooked, but is important if one aims to deploy a machine learning model over an extended period of time. In particular, language evolution causes data drift between time-steps in sequential decision-making tasks. Examples of such tasks include prediction of paper acceptance for yearly conferences (regular intervals) or author stance prediction for rumours on Twitter (irregular intervals). Inspired by successes in computer vision, we tackle data drift by sequentially aligning learned representations. We evaluate on three challenging tasks varying in terms of time-scales, linguistic units, and domains. These tasks show our method outperforming several strong baselines, including using all available data. We argue that, due to its low computational expense, sequential alignment is a practical solution to dealing with language evolution.
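The abstract does not say which alignment technique is used, so the following is only a minimal sketch of what "sequentially aligning learned representations" could look like. It assumes subspace alignment via PCA, a technique from the computer vision domain adaptation literature, and the names `pca_basis`, `align_to`, and `sequential_align` are hypothetical. Mapping the aligned data back into the original feature space and re-centring it on the target mean are also assumptions, made so that alignments can be chained across time-steps.

```python
import numpy as np


def pca_basis(x, k):
    """Return the mean of x and its top-k principal directions as a (d, k) matrix."""
    mean = x.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(x - mean, full_matrices=False)
    return mean, vt[:k].T


def align_to(source, target, k):
    """Align source representations to the target's k-dimensional PCA subspace.

    Assumption: the result is expressed in the original d-dimensional space
    and re-centred on the target mean, so the output can be aligned again.
    """
    mean_s, p_s = pca_basis(source, k)
    mean_t, p_t = pca_basis(target, k)
    m = p_s.T @ p_t  # (k, k) alignment matrix between the two bases
    return (source - mean_s) @ p_s @ m @ p_t.T + mean_t


def sequential_align(past_steps, target, k):
    """Carry earlier time-steps forward, aligning to each later step in turn.

    past_steps: list of (n_i, d) arrays, oldest first (e.g. labelled data).
    target:     (n, d) array for the time-step to predict on.
    Returns all past rows, oldest first, aligned to the target step.
    """
    carried = past_steps[0]
    for current in past_steps[1:]:
        # Align everything seen so far to the next step, then absorb that step.
        carried = np.vstack([align_to(carried, current, k), current])
    return align_to(carried, target, k)
```

A classifier would then be trained on the rows returned by `sequential_align`, with the labels concatenated in the same oldest-first order, and applied to the target step. Note that `k` must not exceed the feature dimension or the number of samples in any step. The computation is a handful of SVDs and matrix products, which is in keeping with the low computational expense the abstract emphasises.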

