CL, Jun 30, 2021

HySPA: Hybrid Span Generation for Scalable Text-to-Graph Extraction

arXiv:2106.15838v1714 citations
AI Analysis

HySPA addresses the poor scaling of information extraction methods on long input texts, offering researchers and practitioners a more scalable alternative.

The paper tackles the scalability issue in text-to-graph extraction by proposing HySPA, a hybrid span generator that invertibly maps information graphs to sequences and decodes them in linear time and space. On the ACE05 dataset, it significantly outperforms the prior state of the art on joint entity and relation extraction.

Text-to-Graph extraction aims to automatically extract information graphs consisting of mentions and types from natural language texts. Existing approaches, such as table filling and pairwise scoring, have shown impressive performance on various information extraction tasks, but they are difficult to scale to datasets with longer input texts because of their second-order space/time complexities with respect to the input length. In this work, we propose a Hybrid Span Generator (HySPA) that invertibly maps the information graph to an alternating sequence of nodes and edge types, and directly generates such sequences via a hybrid span decoder which can decode both the spans and the types recurrently in linear time and space complexities. Extensive experiments on the ACE05 dataset show that our approach also significantly outperforms state-of-the-art on the joint entity and relation extraction task.
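The idea of invertibly mapping an information graph to an alternating sequence of nodes and edge types can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the paper's actual construction: the node layout `(start, end, type)`, the `[SEP]` separator, the sorted traversal order, and the function names `linearize` and `delinearize` are all hypothetical choices made here for clarity.

```python
from collections import defaultdict
from typing import List, Set, Tuple, Union

# Assumed representation (not taken from the paper):
Node = Tuple[int, int, str]      # (start token, end token, entity type)
Edge = Tuple[Node, str, Node]    # (head node, relation type, tail node)
SEP = "[SEP]"                    # hypothetical separator, sits in an edge-type slot

def linearize(nodes: Set[Node], edges: Set[Edge]) -> List[Union[Node, str]]:
    """Map a graph to an alternating node / edge-type sequence."""
    outgoing = defaultdict(list)
    for head, rel, tail in edges:
        outgoing[head].append((rel, tail))
    seq: List[Union[Node, str]] = []
    for head in sorted(nodes):                # fixed order makes the output deterministic
        seq.append(head)
        for rel, tail in sorted(outgoing[head]):
            seq.extend([rel, tail])           # edge type, then the node it points to
        seq.append(SEP)                       # closes this head node's block
    return seq

def delinearize(seq: List[Union[Node, str]]) -> Tuple[Set[Node], Set[Edge]]:
    """Invert linearize: recover the node and edge sets from the sequence."""
    nodes: Set[Node] = set()
    edges: Set[Edge] = set()
    i = 0
    while i < len(seq):
        head = seq[i]
        nodes.add(head)
        i += 1
        while seq[i] != SEP:                  # read (edge type, tail node) pairs
            edges.add((head, seq[i], seq[i + 1]))
            i += 2
        i += 1                                # skip the separator
    return nodes, edges

# Toy example with ACE05-style labels (the spans are made up for illustration).
per, org, gpe = (0, 1, "PER"), (4, 5, "ORG"), (7, 7, "GPE")
nodes = {per, org, gpe}
edges = {(per, "ORG-AFF", org), (org, "GEN-AFF", gpe)}

sequence = linearize(nodes, edges)
assert delinearize(sequence) == (nodes, edges)   # the mapping round-trips
```

In this sketch the sequence length grows as the number of nodes plus the number of edges, rather than quadratically in the input length as in table filling or pairwise scoring.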
