CL AI May 22, 2023

MAILEX: Email Event and Argument Extraction

Saurabh Srivastava, Gaurav Singh, Shou Matsumoto, Ali Raz, Paulo Costa, Joshua Poore, Ziyu Yao

arXiv:2305.13469v221.3135 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of extracting structured events from email conversations for domain-specific NLP applications. Its contribution is incremental in that it adapts existing extraction methods to a new domain.

The authors created MailEx, the first dataset for event extraction from conversational email threads, containing 1.5K threads with ~8K event instances, and found that current approaches struggle with challenges like non-continuous triggers and non-named entity arguments.

In this work, we present MailEx, the first dataset for event extraction from conversational email threads. To this end, we first proposed a new taxonomy covering 10 event types and 76 arguments in the email domain. Our final dataset includes 1.5K email threads and ~4K emails, annotated with ~8K event instances in total. To understand the challenges of the task, we conducted a series of experiments comparing three types of approaches: fine-tuned sequence labeling, fine-tuned generative extraction, and few-shot in-context learning. Our results showed that email event extraction is far from solved, owing to challenges such as extracting non-continuous, shared trigger spans, extracting non-named-entity arguments, and modeling the email conversational history. Our work thus calls for further investigation of this domain-specific event extraction task.
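To make the notion of a non-continuous trigger span concrete, here is a minimal sketch of how one event instance might be represented and turned into sequence-labeling tags. It is an illustration under stated assumptions, not the MailEx annotation format or the authors' code: the class names, the `HypotheticalRequest` event type, the argument roles, and the BIO tagging convention for discontinuous spans are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# (start, end) token offsets, end exclusive. Hypothetical convention.
Span = Tuple[int, int]


@dataclass
class Argument:
    role: str   # hypothetical role name, not taken from the MailEx taxonomy
    span: Span


@dataclass
class EventInstance:
    event_type: str             # hypothetical label
    trigger_spans: List[Span]   # more than one span = non-continuous trigger
    arguments: List[Argument] = field(default_factory=list)


def span_text(tokens: List[str], spans: List[Span]) -> str:
    """Join the tokens covered by the given spans, in document order."""
    return " ".join(" ".join(tokens[s:e]) for s, e in sorted(spans))


def bio_trigger_tags(tokens: List[str], event: EventInstance) -> List[str]:
    """Tag trigger tokens with B-/I- labels; gaps between spans stay 'O'.

    Assumption: the first trigger token gets B-, every later trigger token
    gets I-, even across a gap. Other schemes exist.
    """
    tags = ["O"] * len(tokens)
    first = True
    for start, end in sorted(event.trigger_spans):
        for i in range(start, end):
            tags[i] = ("B-" if first else "I-") + event.event_type
            first = False
    return tags


# Made-up email sentence, not drawn from the dataset.
tokens = "Could you please send me the quarterly report by Friday".split()
event = EventInstance(
    event_type="HypotheticalRequest",
    trigger_spans=[(0, 1), (3, 4)],  # "Could" ... "send"
    arguments=[
        Argument("hypothetical_item", (5, 8)),       # "the quarterly report"
        Argument("hypothetical_deadline", (9, 10)),  # "Friday"
    ],
)
tags = bio_trigger_tags(tokens, event)
```

Note that the first argument here ("the quarterly report") is a plain noun phrase rather than a named entity, which is the kind of argument the abstract describes as difficult to extract.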
