I still have Time(s): Extending HeidelTime for German Texts
This work addresses a domain-specific issue for users of German text-processing tools, but it is incremental, as it extends an existing tool rather than introducing a new method.
The authors tackle the problem of detecting temporal expressions in German texts by extending HeidelTime, a pattern-matching tool, to improve its coverage. The reported gain in coverage is 2.7% or 8.5%, depending on the admitted degree of potential overgeneralization.
HeidelTime is one of the most widespread and successful tools for detecting temporal expressions in texts. Since HeidelTime's pattern-matching system is based on regular expressions, it can be extended conveniently. We present such an extension for the German resources of HeidelTime: HeidelTime-EXT. The extension was developed by observing false negatives in real-world texts and various time banks. The gain in coverage is 2.7% or 8.5%, depending on the admitted degree of potential overgeneralization. We describe the development of HeidelTime-EXT, its evaluation on text samples from various genres, and share some linguistic observations. HeidelTime-EXT can be obtained from https://github.com/texttechnologylab/heideltime.
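To illustrate why a regular-expression-based system is convenient to extend, the following is a minimal Python sketch of the general idea: a baseline pattern set misses an expression (a false negative), and adding a pattern closes the gap. It is not HeidelTime's actual resource format or rule syntax. The names `MONTHS`, `BASE_PATTERNS`, `EXT_PATTERNS`, and `find_timexes`, as well as the patterns themselves, are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical month alternation; not taken from HeidelTime's resources.
MONTHS = (
    "Januar|Februar|März|April|Mai|Juni|Juli|August|"
    "September|Oktober|November|Dezember"
)

# Hypothetical baseline patterns (illustrative only).
BASE_PATTERNS = [
    rf"\b\d{{1,2}}\.\s?(?:{MONTHS})\b",   # e.g. "3. Mai"
    r"\b(?:heute|gestern|morgen)\b",      # simple deictic expressions
]

# Hypothetical extension patterns, added after inspecting false negatives.
EXT_PATTERNS = [
    r"\b(?:vorgestern|übermorgen)\b",     # deictics the baseline misses
    rf"\b(?:Anfang|Mitte|Ende)\s+(?:{MONTHS})\b",  # e.g. "Ende Juni"
]


def find_timexes(text, patterns):
    """Return all substrings of `text` matched by any of `patterns`, in text order."""
    combined = re.compile("|".join(f"(?:{p})" for p in patterns), re.IGNORECASE)
    return [m.group(0) for m in combined.finditer(text)]


if __name__ == "__main__":
    sample = "Wir haben uns vorgestern getroffen und sehen uns am 3. Mai wieder."
    print(find_timexes(sample, BASE_PATTERNS))                 # baseline only
    print(find_timexes(sample, BASE_PATTERNS + EXT_PATTERNS))  # with extension
```

In this sketch, the baseline finds only the date expression in the sample sentence, while the extended set also finds "vorgestern". Broader patterns raise coverage but also the risk of matching non-temporal text, which is the overgeneralization trade-off the abstract refers to.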