CL · Mar 11

Temporal Text Classification with Large Language Models

arXiv:2603.11295v185.1 · h-index: 6 · Has Code
Predicted impact: top 21% in CL · last 90 days
Originality: Synthesis-oriented
AI Analysis

This paper addresses automatic text dating, a problem relevant to researchers and practitioners, but its contribution is incremental: it applies existing LLMs to a new task.

This study tackles Temporal Text Classification (TTC) by evaluating large language models (LLMs) on dating texts. It finds that proprietary models perform well with few-shot prompting, and that fine-tuning improves open-source models, although they still lag behind proprietary ones.

Languages change over time. Computational models can be trained to recognize such changes, enabling them to estimate the publication date of texts. Despite recent advancements in Large Language Models (LLMs), their performance on automatic dating of texts, also known as Temporal Text Classification (TTC), has not been explored. This study provides the first systematic evaluation of leading proprietary (Claude 3.5, GPT-4o, Gemini 1.5) and open-source (LLaMA 3.2, Gemma 2, Mistral, Nemotron 4) LLMs on TTC using three historical corpora, two in English and one in Portuguese. We test zero-shot prompting, few-shot prompting, and fine-tuning settings. Our results indicate that proprietary models perform well, especially with few-shot prompting. They also indicate that fine-tuning substantially improves open-source models, but that these models still fail to match the performance delivered by proprietary LLMs.
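As a rough illustration of the zero-shot and few-shot prompting settings mentioned in the abstract, the sketch below builds a dating prompt with or without labeled examples. It is not the authors' prompt: the function name `build_ttc_prompt`, the period labels in `PERIODS`, and the prompt wording are all hypothetical, since this listing does not give the paper's templates or label sets.

```python
from typing import Sequence, Tuple

# Hypothetical period labels (assumption): the listing does not state
# which time spans the paper's three corpora actually use.
PERIODS = ("1700-1799", "1800-1899", "1900-1999")


def build_ttc_prompt(
    text: str,
    examples: Sequence[Tuple[str, str]] = (),
    periods: Sequence[str] = PERIODS,
) -> str:
    """Build a text-dating prompt.

    With no examples the prompt is zero-shot; with labeled
    (text, period) pairs it becomes few-shot.
    """
    lines = [
        "Classify the passage into the period in which it was most likely published.",
        "Answer with exactly one of: " + ", ".join(periods) + ".",
        "",
    ]
    # Few-shot demonstrations, each shown in the same format as the query.
    for example_text, example_period in examples:
        if example_period not in periods:
            raise ValueError(f"unknown period label: {example_period!r}")
        lines += [f"Text: {example_text}", f"Period: {example_period}", ""]
    # The query itself, leaving the label for the model to complete.
    lines += [f"Text: {text}", "Period:"]
    return "\n".join(lines)
```

Calling `build_ttc_prompt(passage)` gives the zero-shot variant, while passing a handful of labeled pairs as `examples` gives the few-shot variant. The resulting string would then be sent to whichever model is being evaluated.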

