CL, LG · Apr 16, 2019

Doc2EDAG: An End-to-End Document-level Framework for Chinese Financial Event Extraction

arXiv:1904.07535v2 · 1014 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses document-level event extraction in domains such as finance, where an event's arguments are dispersed across sentences. The advance is incremental, since it builds on existing event extraction methods.

The authors tackle document-level event extraction, where event arguments are scattered across sentences, by proposing Doc2EDAG, an end-to-end model that generates an entity-based directed acyclic graph. They also build a large-scale Chinese financial dataset and report that Doc2EDAG outperforms state-of-the-art methods on it.

Most existing event extraction (EE) methods extract event arguments only within the scope of a single sentence. However, such sentence-level EE methods struggle to handle the soaring volume of documents from emerging applications, such as finance, legislation, and health, where event arguments always scatter across different sentences and multiple event mentions frequently co-exist in the same document. To address these challenges, we propose a novel end-to-end model, Doc2EDAG, which generates an entity-based directed acyclic graph to perform document-level EE (DEE) effectively. Moreover, we reformulate the DEE task with a no-trigger-words design to ease document-level event labeling. To demonstrate the effectiveness of Doc2EDAG, we build a large-scale real-world dataset of Chinese financial announcements that exhibits the challenges mentioned above. Extensive experiments with comprehensive analyses illustrate the superiority of Doc2EDAG over state-of-the-art methods. Data and code can be found at https://github.com/dolphin-zs/Doc2EDAG.
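To make the phrase "entity-based directed acyclic graph" more concrete, here is a minimal sketch of one way several event records from the same document could share a graph, so that each root-to-leaf path corresponds to one record. This is an illustration under stated assumptions, not the authors' implementation: it assumes a fixed role order, represents a missing argument as `None`, and uses hypothetical role names (`pledger`, `pledged_shares`, `pledgee`), entity values, and helper names (`build_edag`, `paths_to_records`) that do not appear in the abstract. The learned model that decides how to expand each path is left out entirely.

```python
from typing import Dict, List, Optional

# One event record: role name -> entity string, or None when the argument is missing.
Record = Dict[str, Optional[str]]


def build_edag(records: List[Record], roles: List[str]) -> dict:
    """Merge records into nested dicts, one level per role in the given order.

    Records that agree on the first k roles share the first k nodes,
    so each root-to-leaf path encodes exactly one event record.
    """
    root: dict = {}
    for record in records:
        node = root
        for role in roles:
            # Reuse the child for this entity if it exists, else create it.
            node = node.setdefault(record.get(role), {})
    return root


def paths_to_records(edag: dict, roles: List[str]) -> List[Record]:
    """Enumerate root-to-leaf paths and turn each one back into a record."""
    records: List[Record] = []

    def walk(node: dict, depth: int, prefix: list) -> None:
        if depth == len(roles):
            records.append(dict(zip(roles, prefix)))
            return
        for entity, child in node.items():
            walk(child, depth + 1, prefix + [entity])

    walk(edag, 0, [])
    return records


# Hypothetical example: two pledge events in one announcement share a pledger.
roles = ["pledger", "pledged_shares", "pledgee"]
records = [
    {"pledger": "Company A", "pledged_shares": "100", "pledgee": "Bank X"},
    {"pledger": "Company A", "pledged_shares": "200", "pledgee": None},
]
edag = build_edag(records, roles)
recovered = paths_to_records(edag, roles)
```

In this toy setting the two records share the single `Company A` node and branch at the second role, which is how one graph can hold multiple co-existing events whose arguments come from different sentences.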

Code Implementations: 2 repos
