CL · Apr 19, 2022

I still have Time(s): Extending HeidelTime for German Texts

arXiv:2204.08848v1 · 584 citations · h-index: 24 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific issue for users of German text processing tools, but it is incremental as it builds on an existing method.

The authors tackle the detection of temporal expressions in German texts by extending the German resources of HeidelTime, a pattern-matching tool. The extension raises coverage by 2.7% or 8.5%, depending on how much potential overgeneralization is admitted.

HeidelTime is one of the most widespread and successful tools for detecting temporal expressions in texts. Since HeidelTime's pattern matching system is based on regular expressions, it can be extended in a convenient way. We present such an extension for the German resources of HeidelTime: HeidelTime-EXT. The extension was developed by observing false negatives in real-world texts and various time banks. The gain in coverage is 2.7% or 8.5%, depending on the admitted degree of potential overgeneralization. We describe the development of HeidelTime-EXT, its evaluation on text samples from various genres, and share some linguistic observations. HeidelTime-EXT can be obtained from https://github.com/texttechnologylab/heideltime.
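The idea described in the abstract, extending a regular-expression tagger with variants found among its false negatives and then measuring the change in coverage, can be illustrated with a minimal sketch. This is not HeidelTime's resource format or API. The month lists, the pattern shape, the sample sentences, and the helper names (`build_pattern`, `tag`, `coverage`) are all assumptions made for illustration, and the toy coverage figures have no relation to the 2.7% and 8.5% reported by the authors.

```python
import re

# Hypothetical baseline vocabulary: standard German month names.
MONTHS = ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
          "August", "September", "Oktober", "November", "Dezember"]

# Hypothetical extension: variants one might add after seeing false negatives
# (Austrian forms and abbreviations). Not taken from HeidelTime-EXT.
MONTHS_EXT = MONTHS + ["Jänner", "Feber", "Jan.", "Feb.", "Dez."]


def build_pattern(months):
    """Compile a pattern for '[day.] month [year]' from a list of month forms."""
    # Longest forms first, so a full name is tried before its abbreviation.
    alternatives = "|".join(re.escape(m) for m in sorted(months, key=len, reverse=True))
    return re.compile(
        rf"(?<!\w)(?:\d{{1,2}}\.\s*)?(?:{alternatives})(?!\w)(?:\s+\d{{4}})?"
    )


def tag(pattern, text):
    """Return all matched temporal expressions in the text."""
    return [m.group(0) for m in pattern.finditer(text)]


def coverage(pattern, samples, gold):
    """Fraction of gold expressions that the pattern finds (a simple recall)."""
    found = sum(len(set(tag(pattern, s)) & set(g)) for s, g in zip(samples, gold))
    total = sum(len(g) for g in gold)
    return found / total


BASE = build_pattern(MONTHS)
EXT = build_pattern(MONTHS_EXT)

# Toy sentences with hand-assigned gold annotations (illustrative only).
SAMPLES = [
    "Die Sitzung fand am 3. Mai 2021 statt.",
    "Im Jänner 2020 begann das Projekt.",
    "Abgabe bis 15. Feb. 2022.",
    "Er wohnt in Mainz.",  # must not match: 'Mai' is only part of a word
]
GOLD = [["3. Mai 2021"], ["Jänner 2020"], ["15. Feb. 2022"], []]

if __name__ == "__main__":
    print("baseline coverage:", coverage(BASE, SAMPLES, GOLD))
    print("extended coverage:", coverage(EXT, SAMPLES, GOLD))
```

The last sample hints at the trade-off the abstract mentions: each added variant can raise coverage, but looser patterns also increase the risk of matching text that is not a temporal expression, which is why the sketch guards the month forms with word-boundary checks.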

Code Implementations: 1 repo