CL LG ML

Sep 6, 2019

Learning in Text Streams: Discovery and Disambiguation of Entity and Relation Instances

Marco Maggini, Giuseppe Marra, Stefano Melacci, Andrea Zugarini

arXiv:1909.05367v30.515 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of real-time knowledge base construction from dynamic text for AI agents. The advance is incremental, since it builds on existing memory network approaches.

The paper tackles the problem of online entity and relation discovery and disambiguation from text streams, proposing a memory network that builds a knowledge base and improves with unsupervised reading, achieving strong performance on Wikipedia and a new public dataset.

We consider a scenario where an artificial agent is reading a stream of text composed of a set of narrations, and it is informed about the identity of some of the individuals that are mentioned in the text portion that is currently being read. The agent is expected to learn to follow the narrations, thus disambiguating mentions and discovering new individuals. We focus on the case in which individuals are entities and relations, and we propose an end-to-end trainable memory network that learns to discover and disambiguate them in an online manner, performing one-shot learning, and dealing with a small number of sparse supervisions. Our system builds a not-given-in-advance knowledge base, and it improves its skills while reading unsupervised text. The model deals with abrupt changes in the narration, taking into account their effects when resolving co-references. We showcase the strong disambiguation and discovery skills of our model on a corpus of Wikipedia documents and on a newly introduced dataset that we make publicly available.
