CL Nov 11, 2022

MEE: A Novel Multilingual Event Extraction Dataset

Amir Pouran Ben Veyseh, Javid Ebrahimi, Franck Dernoncourt, Thien Huu Nguyen

arXiv:2211.05955v224.5299 citations h-index: 41

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of limited resources for non-English event extraction. The contribution is incremental, as it provides a new dataset rather than a new method.

The authors tackled the lack of high-quality multilingual datasets for event extraction by creating MEE, a dataset with over 50K event mentions across 8 languages, and conducted experiments to identify challenges and opportunities in this area.

Event Extraction (EE) is a fundamental task in Information Extraction (IE) that aims to recognize event mentions and their arguments (i.e., participants) in text. Due to its importance, numerous methods and resources have been developed for EE. However, one limitation of current EE research is the under-exploration of non-English languages, for which the lack of high-quality multilingual EE datasets for model training and evaluation has been the main hindrance. To address this limitation, we propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages. MEE comprehensively annotates data for entity mentions, event triggers and event arguments. We conduct extensive experiments on the proposed dataset to reveal challenges and opportunities for multilingual EE.
