CL · Jun 30, 2021

HySPA: Hybrid Span Generation for Scalable Text-to-Graph Extraction

Liliang Ren, Chenkai Sun, Heng Ji, Julia Hockenmaier

arXiv:2106.15838 · Has Code

Originality: Highly original

AI Analysis

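HySPA addresses the poor scaling of existing information extraction methods on long input texts. Table filling and pairwise scoring need space and time that grow quadratically with input length, whereas HySPA decodes in linear time and space, which makes it a more scalable option for researchers and practitioners.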

The paper tackles the scalability issue in text-to-graph extraction by proposing HySPA, a hybrid span generator that maps graphs to sequences and decodes them linearly, achieving significant performance improvements on the ACE05 dataset.

Text-to-Graph extraction aims to automatically extract information graphs consisting of mentions and types from natural language texts. Existing approaches, such as table filling and pairwise scoring, have shown impressive performance on various information extraction tasks, but they are difficult to scale to datasets with longer input texts because of their second-order space/time complexities with respect to the input length. In this work, we propose a Hybrid Span Generator (HySPA) that invertibly maps the information graph to an alternating sequence of nodes and edge types, and directly generates such sequences via a hybrid span decoder which can decode both the spans and the types recurrently in linear time and space complexities. Extensive experiments on the ACE05 dataset show that our approach also significantly outperforms state-of-the-art on the joint entity and relation extraction task.
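To make the idea of an invertible graph-to-sequence mapping concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `Node` tuple, the `[SEP]` separator, the sorted traversal order, and the function names `graph_to_sequence` and `sequence_to_graph` are all hypothetical choices made for this example. Each block of the sequence starts with a head node and then alternates edge types and tail nodes, so the sequence length grows linearly with the number of nodes and edges.

```python
from typing import NamedTuple


class Node(NamedTuple):
    """A typed mention span (hypothetical representation for this sketch)."""
    start: int
    end: int
    label: str


SEP = "[SEP]"  # assumed block separator, not taken from the paper


def graph_to_sequence(nodes, edges):
    """Flatten a graph into blocks of: head, (edge_type, tail)*, SEP.

    nodes: iterable of Node
    edges: iterable of (head Node, edge_type str, tail Node)
    """
    seq = []
    for node in sorted(nodes):
        seq.append(node)
        # Within a block, items alternate: node, edge type, node, ...
        for _, edge_type, tail in sorted(e for e in edges if e[0] == node):
            seq.extend([edge_type, tail])
        seq.append(SEP)
    return seq


def sequence_to_graph(seq):
    """Invert graph_to_sequence, recovering the node and edge sets."""
    nodes, edges = set(), set()
    block = []
    for item in seq:
        if item == SEP:
            if block:
                head = block[0]
                nodes.add(head)
                for i in range(1, len(block), 2):
                    edges.add((head, block[i], block[i + 1]))
                    nodes.add(block[i + 1])
            block = []
        else:
            block.append(item)
    return nodes, edges


# Toy graph: one person affiliated with one organization, plus an isolated node.
per, org, gpe = Node(0, 1, "PER"), Node(4, 5, "ORG"), Node(7, 8, "GPE")
nodes = {per, org, gpe}
edges = {(per, "ORG-AFF", org)}

seq = graph_to_sequence(nodes, edges)
recovered_nodes, recovered_edges = sequence_to_graph(seq)
```

In this sketch the sequence has two entries per node (the node and its separator) and two per edge (the edge type and the tail), so its length is 2|V| + 2|E|. A table-filling formulation instead scores every pair of input positions, which is the quadratic cost the abstract refers to.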
