CL · Jan 28, 2021

LOME: Large Ontology Multilingual Extraction

arXiv:2101.12175v2 · 804 citations
AI Analysis

This work addresses the need for efficient multilingual information extraction tools and offers a practical, broadly applicable system. Its contribution appears incremental, since it builds on existing multilingual encoders and frameworks.

The paper presents LOME, a system for multilingual information extraction. It identifies entity and event mentions, then performs coreference resolution, entity typing, and temporal relation prediction to construct a knowledge graph. Its multilingual modules match or exceed monolingual state-of-the-art methods.

We present LOME, a system for performing multilingual information extraction. Given a text document as input, our core system identifies spans of textual entity and event mentions with a FrameNet (Baker et al., 1998) parser. It subsequently performs coreference resolution, fine-grained entity typing, and temporal relation prediction between events. By doing so, the system constructs an event and entity focused knowledge graph. We can further apply third-party modules for other types of annotation, like relation extraction. Our (multilingual) first-party modules either outperform or are competitive with the (monolingual) state-of-the-art. We achieve this through the use of multilingual encoders like XLM-R (Conneau et al., 2020) and leveraging multilingual training data. LOME is available as a Docker container on Docker Hub. In addition, a lightweight version of the system is accessible as a web demo.
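The stage order described in the abstract can be sketched as a small pipeline that fills an event and entity focused graph. This is only an illustration under stated assumptions, not LOME's actual code or API: `Mention`, `KnowledgeGraph`, `run_pipeline`, and the four stage callables are hypothetical names, and the toy stages return hard-coded output in place of real models.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    start: int        # token offset, inclusive
    end: int          # token offset, exclusive
    kind: str         # "entity" or "event"
    frame: str = ""   # FrameNet-style label, if any (hypothetical field)


@dataclass
class KnowledgeGraph:
    mentions: list = field(default_factory=list)
    clusters: list = field(default_factory=list)            # lists of mention indices
    entity_types: dict = field(default_factory=dict)        # cluster index -> type label
    temporal_relations: list = field(default_factory=list)  # (event_i, relation, event_j)


def run_pipeline(tokens, find_spans, resolve_coref, type_entities, order_events):
    """Apply the four stages in the order the abstract lists them."""
    graph = KnowledgeGraph()
    graph.mentions = find_spans(tokens)
    graph.clusters = resolve_coref(tokens, graph.mentions)
    graph.entity_types = type_entities(tokens, graph.mentions, graph.clusters)
    graph.temporal_relations = order_events(tokens, graph.mentions)
    return graph


# Toy stand-ins for the real models (assumptions, hard-coded for one sentence).
tokens = ["Maria", "arrived", "before", "she", "spoke"]
graph = run_pipeline(
    tokens,
    find_spans=lambda toks: [
        Mention(0, 1, "entity"),
        Mention(1, 2, "event", "Arriving"),
        Mention(3, 4, "entity"),
        Mention(4, 5, "event", "Statement"),
    ],
    resolve_coref=lambda toks, mentions: [[0, 2]],          # "Maria" and "she"
    type_entities=lambda toks, mentions, clusters: {0: "person"},
    order_events=lambda toks, mentions: [(1, "before", 3)],  # arrived before spoke
)
```

Passing the stages in as callables keeps the sketch independent of any particular model. A third-party module such as relation extraction could be added as one more callable that writes to its own field on the graph.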

