Temporal expression normalisation in natural language texts
This work addresses the scarcity of gold-standard resources for temporal expression annotation in information extraction and improves on existing systems for normalising temporal expressions.
The author tackles the problem of normalising temporal expressions in English texts by developing a novel rule-based architecture that outperforms state-of-the-art systems on the TempEval-2 Shared Task (value attribute), and introduces a new free corpus of 2822 unique annotated temporal expressions.
Automatic annotation of temporal expressions is a research challenge of great interest in the field of information extraction. In this report, I describe a novel rule-based architecture, built on top of a pre-existing system, which normalises temporal expressions detected in English texts. Gold-standard temporally annotated resources are limited in size, which makes research difficult. The proposed system outperforms state-of-the-art systems on the TempEval-2 Shared Task (value attribute) and achieves substantially better results than the pre-existing system on which it is built. I also introduce a new free corpus of 2822 unique annotated temporal expressions. Both the corpus and the system are freely available online.
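To make the task concrete, the following is a minimal sketch of what rule-based normalisation involves: each rule matches a surface pattern and maps it to an ISO 8601-style string of the kind stored in a TIMEX3 value attribute. It is not the architecture described in this report; the function name `normalise`, the three rules, and the reference-date parameter are assumptions chosen purely for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup tables; a real rule set would be far larger.
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def normalise(expression, reference):
    """Return an ISO 8601-style value for a temporal expression, or None.

    `reference` is the date against which relative expressions are
    resolved (for example, the document creation date).
    """
    text = expression.strip().lower()

    # Rule 1: deictic day expressions, resolved against the reference date.
    if text in RELATIVE_DAYS:
        return (reference + timedelta(days=RELATIVE_DAYS[text])).isoformat()

    # Rule 2: fully specified dates such as "March 5, 2010".
    match = re.fullmatch(r"([a-z]+) (\d{1,2}), (\d{4})", text)
    if match and match.group(1) in MONTHS:
        try:
            return date(int(match.group(3)), MONTHS[match.group(1)],
                        int(match.group(2))).isoformat()
        except ValueError:
            return None  # impossible calendar date, e.g. "February 30, 2010"

    # Rule 3: month and year only, such as "March 2010".
    match = re.fullmatch(r"([a-z]+) (\d{4})", text)
    if match and match.group(1) in MONTHS:
        return f"{match.group(2)}-{MONTHS[match.group(1)]:02d}"

    return None  # no rule applies


if __name__ == "__main__":
    anchor = date(2010, 3, 1)
    for phrase in ["yesterday", "March 5, 2010", "March 2010", "a while ago"]:
        print(phrase, "->", normalise(phrase, anchor))
```

Ordering the rules from most to least specific keeps a shorter pattern from claiming an expression that a longer one would normalise more precisely; expressions that no rule covers are returned as None rather than guessed.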