What did you Mention? A Large Scale Mention Detection Benchmark for Spoken and Written Text
This benchmark gives researchers and practitioners in natural language processing a standardized tool for evaluating mention detection systems. The contribution is incremental, since it builds on existing benchmark concepts.
The authors tackled the lack of a comprehensive benchmark for mention detection by creating a large, high-quality dataset annotated for named and other entities across clean and noisy text, including spoken data, and demonstrated results from a state-of-the-art system on it.
We describe a large, high-quality benchmark for the evaluation of Mention Detection tools. The benchmark contains annotations of both named entities and other types of entities, annotated on different types of text, ranging from clean text taken from Wikipedia to noisy spoken data. The benchmark was built through a highly controlled crowdsourcing process to ensure its quality. We describe the benchmark, along with the process and the guidelines that were used to build it. We then demonstrate the results of a state-of-the-art system on the benchmark.
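
The abstract does not say how a system is scored against the benchmark. As a minimal sketch, assuming exact-match scoring over mention spans (a common convention for mention detection, not confirmed by the source), precision, recall, and F1 could be computed as follows. The names `Span` and `span_prf` and the character-offset representation are hypothetical, chosen for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation: a mention is a (start, end) character
# offset pair, end exclusive. The benchmark's actual format may differ.
Span = Tuple[int, int]


def span_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match span scoring.

    A predicted mention counts as correct only if both of its
    boundaries match a gold mention exactly (an assumed convention).
    """
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    true_positives = len(gold_set & pred_set)

    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Usage: three gold mentions, two predictions, one exact match.
gold_mentions = [(0, 5), (10, 15), (20, 25)]
system_mentions = [(0, 5), (10, 14)]
precision, recall, f1 = span_prf(gold_mentions, system_mentions)
```

Exact matching is strict: the prediction `(10, 14)` overlaps a gold mention but still counts as an error. This matters on noisy spoken data, where boundaries are harder to pin down, so a partial-overlap variant may be worth reporting alongside it.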