CL, IR | Jun 10, 2012

Temporal expression normalisation in natural language texts

arXiv:1206.2010v1 | 1 citation

Originality: Incremental advance

AI Analysis

This work addresses the challenge of limited gold-standard resources for temporal expression annotation in information extraction, offering an incremental improvement over existing systems.

The author tackles the normalisation of temporal expressions in English texts with a rule-based architecture, built on top of a pre-existing system, that outperforms state-of-the-art systems on the value attribute of the TempEval-2 Shared Task. The work also introduces a new free corpus of 2822 unique annotated temporal expressions.

Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which is able to normalise temporal expressions detected in English texts. Gold-standard temporally annotated resources are limited in size, which makes research difficult. The proposed system outperforms state-of-the-art systems on the TempEval-2 Shared Task (value attribute) and achieves substantially better results than the pre-existing system on which it is built. I also introduce a new free corpus consisting of 2822 unique annotated temporal expressions. Both the corpus and the system are freely available online.
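To make the idea of rule-based normalisation concrete, the sketch below maps a few already-detected English expressions to ISO 8601-style strings of the kind stored in a TIMEX3 value attribute. It is an illustration only, not the system described in the report: the rule set, the `normalise` function, and the `RULES` table are all hypothetical, and the reference date stands in for a document creation time.

```python
import re
from datetime import date, timedelta
from typing import Optional

# Hypothetical month lookup used by the illustrative rules below.
MONTHS = {name: number for number, name in enumerate(
    "january february march april may june july august "
    "september october november december".split(), start=1)}
MONTH_RE = "|".join(MONTHS)

# Each rule pairs a pattern with a handler that builds the normalised value.
# Rules are tried in order; the first full match wins.
# Assumption: these rules are invented for illustration and do not validate
# calendar dates (e.g. "June 31, 2012" would be accepted).
RULES = [
    (re.compile(r"today"),
     lambda m, ref: ref.isoformat()),
    (re.compile(r"tomorrow"),
     lambda m, ref: (ref + timedelta(days=1)).isoformat()),
    (re.compile(r"yesterday"),
     lambda m, ref: (ref - timedelta(days=1)).isoformat()),
    (re.compile(r"(last|next) year"),
     lambda m, ref: str(ref.year + (1 if m.group(1) == "next" else -1))),
    (re.compile(rf"({MONTH_RE}) (\d{{1,2}}), (\d{{4}})"),
     lambda m, ref: f"{m.group(3)}-{MONTHS[m.group(1)]:02d}-{int(m.group(2)):02d}"),
    (re.compile(rf"({MONTH_RE}) (\d{{4}})"),
     lambda m, ref: f"{m.group(2)}-{MONTHS[m.group(1)]:02d}"),
    (re.compile(r"\d{4}"),
     lambda m, ref: m.group(0)),
]


def normalise(expression: str, reference: date) -> Optional[str]:
    """Return a normalised value for the expression, or None if no rule applies."""
    # Lower-case and collapse whitespace so the patterns stay simple.
    text = " ".join(expression.lower().split())
    for pattern, handler in RULES:
        match = pattern.fullmatch(text)
        if match:
            return handler(match, reference)
    return None


if __name__ == "__main__":
    reference = date(2012, 6, 10)  # stand-in for the document creation time
    for expression in ["June 10, 2012", "tomorrow", "last year", "some time ago"]:
        print(expression, "->", normalise(expression, reference))
```

Relative expressions such as "tomorrow" can only be resolved against a reference date, which is why the function takes one explicitly. Expressions that no rule covers return `None` rather than a guess.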
