CL | AI | Nov 21, 2024

Forecasting Future International Events: A Reliable Dataset for Text-Based Event Modeling

arXiv:2411.14042v1 | 24 citations | h-index: 10 | EMNLP
Originality: Synthesis-oriented
AI Analysis

This paper addresses the lack of reliable datasets for text-based event prediction, a task that matters for applications in global policy and geopolitics. The contribution is incremental, however, because it focuses on dataset creation rather than on new prediction methods.

The paper tackles the problem of predicting future international events from text by introducing the WORLDREP dataset, which uses LLMs and expert validation to provide high-quality labels, and demonstrates its effectiveness through experiments.

Predicting future international events from textual information, such as news articles, has tremendous potential for applications in global policy, strategic decision-making, and geopolitics. However, existing datasets available for this task are often limited in quality, hindering the progress of related research. In this paper, we introduce WORLDREP (WORLD Relationship and Event Prediction), a novel dataset designed to address these limitations by leveraging the advanced reasoning capabilities of large language models (LLMs). Our dataset features high-quality scoring labels generated through advanced prompt modeling and rigorously validated by domain experts in political science. We showcase the quality and utility of WORLDREP for real-world event prediction tasks, demonstrating its effectiveness through extensive experiments and analysis. Furthermore, we publicly release our dataset along with the full automation source code for data collection, labeling, and benchmarking, aiming to support and advance research in text-based event prediction.

Code Implementations: 1 repo
