Text2Time: Transformer-based Article Time Period Prediction
This work addresses time period prediction for news articles, a task useful for historical research, sentiment analysis, and media monitoring.
The authors predict the publication period of news articles from their text. They built a dataset of over 350,000 articles and fine-tuned a BERT model that classifies articles by publication decade and outperforms a baseline on this relatively unexplored task.
Predicting the publication period of text documents, such as news articles, is an important but understudied problem in natural language processing. Predicting the year of a news article can be useful in various contexts, such as historical research, sentiment analysis, and media monitoring. In this work, we investigate the problem of predicting the publication period of a news article from its textual content. To do so, we created an extensive labeled dataset of over 350,000 news articles published by The New York Times over six decades. Our approach uses a pretrained BERT model fine-tuned for text classification, with publication time periods as the target classes. This model exceeded our expectations, accurately classifying news articles into their respective publication decades and outperforming the baseline model on this relatively unexplored task of time prediction from text.
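As an illustration of how time period prediction can be framed as text classification, the following minimal sketch maps publication years to decade class indices of the kind a fine-tuned classifier would predict. The function names (`decade_of`, `build_label_map`, `encode`) and the sample texts and years are hypothetical and are not taken from the original work; the actual label scheme used by the authors may differ.

```python
from typing import Dict, List, Tuple


def decade_of(year: int) -> int:
    """Return the first year of the decade containing `year` (e.g. 1987 -> 1980)."""
    return (year // 10) * 10


def build_label_map(years: List[int]) -> Dict[int, int]:
    """Assign a contiguous class index to every decade present in the data."""
    decades = sorted({decade_of(y) for y in years})
    return {decade: index for index, decade in enumerate(decades)}


def encode(examples: List[Tuple[str, int]], label_map: Dict[int, int]) -> List[Tuple[str, int]]:
    """Replace each example's publication year with its decade class index."""
    return [(text, label_map[decade_of(year)]) for text, year in examples]


# Hypothetical (text, publication year) pairs, for illustration only.
examples = [
    ("Placeholder article text A", 1987),
    ("Placeholder article text B", 1994),
    ("Placeholder article text C", 1981),
]

label_map = build_label_map([year for _, year in examples])
encoded = encode(examples, label_map)
```

The resulting `(text, class index)` pairs are the kind of input a sequence classification model would typically be fine-tuned on, with the number of classes equal to `len(label_map)`.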