CEHA: A Dataset of Conflict Events in the Horn of Africa
CEHA provides a new benchmark dataset for NLP analysis of conflict events in the Horn of Africa, addressing a gap in fine-grained event-type labeling for humanitarian stakeholders.
The authors address the lack of fine-grained conflict event data for the Horn of Africa by introducing CEHA, a dataset of 500 English event descriptions with detailed event-type definitions, and demonstrate its utility through baseline experiments on two classification tasks: event relevance and event type.
Natural Language Processing (NLP) of news articles can play an important role in understanding the dynamics and causes of violent conflict. Despite the availability of datasets categorizing various conflict events, the existing labels often do not cover all of the fine-grained violent conflict event types relevant to areas like the Horn of Africa. In this paper, we introduce a new benchmark dataset, Conflict Events in the Horn of Africa region (CEHA), and propose a new task for identifying violent conflict events using online resources with this dataset. The dataset consists of 500 English descriptions of conflict events in the Horn of Africa region, with fine-grained event-type definitions that emphasize the cause of the conflict. It categorizes the key types of conflict risk according to the specific areas required by stakeholders in the Humanitarian-Peace-Development Nexus. Additionally, we conduct extensive experiments on two tasks supported by this dataset: Event-relevance Classification and Event-type Classification. Our baseline models demonstrate the challenging nature of these tasks and the usefulness of our dataset for model evaluation in low-resource settings with a limited amount of training data.
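As a rough illustration of how the two tasks named above might be derived from a single set of annotated descriptions, the following minimal sketch uses a hypothetical record layout. The field names (`text`, `relevant`, `event_type`), the placeholder rows, and the label `type_a` are assumptions for illustration only and are not taken from the CEHA schema. The sketch also assumes that event-type labels apply only to relevant descriptions, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EventRecord:
    """Hypothetical record layout; not the actual CEHA schema."""
    text: str                  # English event description
    relevant: bool             # whether the description is a conflict event of interest
    event_type: Optional[str]  # fine-grained type; assumed set only when relevant


def relevance_examples(records: List[EventRecord]) -> List[Tuple[str, int]]:
    """Event-relevance Classification: one binary label per description."""
    return [(r.text, int(r.relevant)) for r in records]


def type_examples(records: List[EventRecord]) -> List[Tuple[str, str]]:
    """Event-type Classification: type labels for relevant descriptions only (assumption)."""
    return [
        (r.text, r.event_type)
        for r in records
        if r.relevant and r.event_type is not None
    ]


# Placeholder rows written for this sketch; "type_a" is a made-up label.
records = [
    EventRecord("Armed group attacked a village.", True, "type_a"),
    EventRecord("Officials opened a new market.", False, None),
]

relevance_data = relevance_examples(records)
type_data = type_examples(records)
```

Keeping both labels on one record means the two task views stay in sync when annotations change; whether the released dataset is organized this way is not specified in the abstract.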