LOME: Large Ontology Multilingual Extraction
This work addresses the need for multilingual information extraction tools and offers a practical system with broad applicability. The contribution appears incremental, since it builds on existing multilingual encoders and frameworks.
The paper presents LOME, a system for multilingual information extraction. LOME identifies entity and event mentions, then performs coreference resolution, entity typing, and temporal relation prediction to construct knowledge graphs. Its multilingual modules either outperform or are competitive with monolingual state-of-the-art methods.
We present LOME, a system for performing multilingual information extraction. Given a text document as input, our core system identifies spans of textual entity and event mentions with a FrameNet (Baker et al., 1998) parser. It subsequently performs coreference resolution, fine-grained entity typing, and temporal relation prediction between events. By doing so, the system constructs an event- and entity-focused knowledge graph. We can further apply third-party modules for other types of annotation, such as relation extraction. Our (multilingual) first-party modules either outperform or are competitive with the (monolingual) state-of-the-art. We achieve this by using multilingual encoders like XLM-R (Conneau et al., 2020) and by leveraging multilingual training data. LOME is available as a Docker container on Docker Hub. In addition, a lightweight version of the system is accessible as a web demo.
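The stage ordering described above (mention spans, then coreference, then entity typing, then temporal relations, all feeding one graph) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Mention`, `KnowledgeGraph`, and `build_graph` are hypothetical and are not LOME's API, and the four stage functions are hand-written toy stand-ins rather than the FrameNet parser or the XLM-R based models.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mention:
    """A text span flagged as an entity or event mention (hypothetical type)."""
    start: int  # token index, inclusive
    end: int    # token index, exclusive
    text: str
    kind: str   # "entity" or "event"


@dataclass
class KnowledgeGraph:
    """Nodes are coreference clusters; edges are labeled relations between them."""
    clusters: dict = field(default_factory=dict)  # cluster id -> list of Mention
    types: dict = field(default_factory=dict)     # entity cluster id -> type label
    edges: list = field(default_factory=list)     # (source id, label, target id)


def build_graph(tokens, find_mentions, cluster, type_entity, order_events):
    """Run four caller-supplied stages in the order the abstract lists them."""
    graph = KnowledgeGraph()
    mentions = find_mentions(tokens)                         # 1. span identification
    for cid, members in enumerate(cluster(mentions)):        # 2. coreference
        graph.clusters[cid] = members
        if members[0].kind == "entity":
            graph.types[cid] = type_entity(tokens, members)  # 3. entity typing
    event_ids = [c for c, m in graph.clusters.items() if m[0].kind == "event"]
    for i, a in enumerate(event_ids):                        # 4. temporal relations
        for b in event_ids[i + 1:]:
            label = order_events(tokens, graph.clusters[a], graph.clusters[b])
            if label is not None:
                graph.edges.append((a, label, b))
    return graph


# Toy stand-ins for trained models, hard-coded for one sentence.
tokens = "Maria founded the company before she sold it".split()


def toy_mentions(tokens):
    spans = [(0, 1, "entity"), (1, 2, "event"), (2, 4, "entity"),
             (5, 6, "entity"), (6, 7, "event"), (7, 8, "entity")]
    return [Mention(s, e, " ".join(tokens[s:e]), k) for s, e, k in spans]


def toy_cluster(mentions):
    by_text = {m.text: m for m in mentions}
    return [
        [by_text["Maria"], by_text["she"]],
        [by_text["founded"]],
        [by_text["the company"], by_text["it"]],
        [by_text["sold"]],
    ]


def toy_type(tokens, members):
    return "person" if members[0].text == "Maria" else "organization"


def toy_order(tokens, first, second):
    # Naive assumption: textual order equals temporal order.
    return "before" if first[0].start < second[0].start else "after"


graph = build_graph(tokens, toy_mentions, toy_cluster, toy_type, toy_order)
```

Passing the stages in as functions keeps the graph-building logic independent of any particular model, which loosely mirrors how the abstract describes third-party modules being applied on top of the core system. In this sketch only entity clusters receive a type and only event clusters are linked by temporal edges.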