CL Jan 23, 2018

What did you Mention? A Large Scale Mention Detection Benchmark for Spoken and Written Text

Yosi Mass, Lili Kotlerman, Shachar Mirkin, Elad Venezian, Gera Witzling, Noam Slonim

arXiv:1801.07507v30.52 citations

Originality Synthesis-oriented

AI Analysis

The benchmark gives natural language processing researchers and practitioners a standardized evaluation tool for mention detection, though the contribution is incremental: it builds on existing benchmark concepts.

The authors tackled the lack of a comprehensive benchmark for mention detection by creating a large, high-quality dataset annotated for named and other entities across clean and noisy text, including spoken data, and demonstrated results from a state-of-the-art system on it.

We describe a large, high-quality benchmark for the evaluation of Mention Detection tools. The benchmark contains annotations of both named entities and other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia to noisy spoken data. The benchmark was built through a highly controlled crowdsourcing process to ensure its quality. We describe the benchmark, the process and the guidelines that were used to build it. We then demonstrate the results of a state-of-the-art system running on that benchmark.
