Massively Multi-Lingual Event Understanding: Extraction, Visualization, and Search
This work addresses event understanding for users who need to analyze multilingual data. It offers a practical tool for cross-lingual event extraction and search, though it builds incrementally on existing methods.
The paper tackles event extraction and search across languages by introducing ISI-Clear, a cross-lingual, zero-shot system that processes text in 100 languages using only English training data. The system provides on-demand access to global events, with integrated visualization and search capabilities.
In this paper, we present ISI-Clear, a state-of-the-art, cross-lingual, zero-shot event extraction system and accompanying user interface for event visualization and search. Using only English training data, ISI-Clear makes global events available on demand, processing user-supplied text in 100 languages ranging from Afrikaans to Yiddish. We provide multiple event-centric views of extracted events, including both a graphical representation and a document-level summary. We also integrate existing cross-lingual search algorithms with event extraction capabilities to provide cross-lingual event-centric search. This allows English-speaking users to search over events automatically extracted from a corpus of non-English documents, using either English natural language queries (e.g., "cholera outbreaks in Iran") or structured queries (e.g., find all events of type Disease-Outbreak with agent cholera and location Iran).
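To make the structured query above concrete, the following is a minimal sketch of filtering already-extracted events by type and argument roles. It is an illustration only and does not reflect ISI-Clear's actual data model or API: the `Event` record, its field names, the `structured_search` function, and the sample documents are all hypothetical, and the sketch assumes argument values have already been rendered in English.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """Hypothetical record for one extracted event (not ISI-Clear's schema)."""
    event_type: str
    doc_id: str
    # Maps an argument role (e.g. "agent") to an English rendering of its text.
    arguments: Dict[str, str] = field(default_factory=dict)


def structured_search(events: List[Event],
                      event_type: Optional[str] = None,
                      **roles: str) -> List[Event]:
    """Return events that match the type and every requested role value."""
    matches = []
    for ev in events:
        if event_type is not None and ev.event_type != event_type:
            continue
        # Case-insensitive exact match on each requested argument role.
        if all(ev.arguments.get(role, "").casefold() == value.casefold()
               for role, value in roles.items()):
            matches.append(ev)
    return matches


# Made-up sample data standing in for events extracted from a corpus.
events = [
    Event("Disease-Outbreak", "doc-1", {"agent": "cholera", "location": "Iran"}),
    Event("Disease-Outbreak", "doc-2", {"agent": "measles", "location": "Iran"}),
    Event("Protest", "doc-3", {"location": "Iran"}),
]

hits = structured_search(events, event_type="Disease-Outbreak",
                         agent="cholera", location="Iran")
print([ev.doc_id for ev in hits])  # prints ['doc-1']
```

Exact string matching is the simplest possible choice here. A natural language query such as "cholera outbreaks in Iran" would instead need a retrieval component, which this sketch does not attempt.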