Generating Information Extraction Patterns from Overlapping and Variable Length Annotations using Sequence Alignment
This work addresses the challenge of handling overlapping and variable-length annotations in information extraction, a problem relevant to researchers in natural language processing. The contribution appears incremental, since it builds on existing sequence alignment techniques.
The paper generates information extraction patterns from overlapping and variable-length annotations. It uses sequence alignment both to capture multi-level patterns and to determine the context window around each target, which removes the need for a predefined fixed window. The system is evaluated on the CoNLL-2003 NER task, but the abstract reports no concrete performance numbers.
Sequence alignments are used to capture patterns composed of elements representing multiple conceptual levels through the alignment of sequences that contain overlapping and variable-length annotations. The alignments also determine the proper context window of words and phrases that most directly impact the meaning of a given target within a sentence, eliminating the need to predefine a fixed context window of words surrounding the targets. We evaluated the system using the CoNLL-2003 named entity recognition (NER) task.
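To make the idea concrete, the following is a minimal sketch of how aligning two annotated sentences could yield a multi-level pattern and a context window. The abstract does not name the alignment algorithm or scoring scheme, so everything here is an assumption for illustration: a Needleman-Wunsch style global alignment, a score equal to the number of shared labels, a gap penalty of -1, and a simplified representation in which each token carries the set of all labels that cover it (word, part of speech, and any annotation span). The names `Element`, `score`, `align`, and `trim` are hypothetical.

```python
from typing import FrozenSet, List, Optional

# One sequence element: every label that covers a token, across levels.
# Assumption: a multi-token annotation simply tags each token it spans.
Element = FrozenSet[str]

GAP_PENALTY = -1  # assumed value, not taken from the paper


def score(a: Element, b: Element) -> int:
    """Reward shared labels at any level; penalize elements with none in common."""
    shared = len(a & b)
    return shared if shared else -1


def align(s: List[Element], t: List[Element]) -> List[Optional[Element]]:
    """Globally align s and t; return the shared labels per column (None = no match)."""
    n, m = len(s), len(t)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * GAP_PENALTY
    for j in range(1, m + 1):
        dp[0][j] = j * GAP_PENALTY
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]),  # align two elements
                dp[i - 1][j] + GAP_PENALTY,                    # gap in t
                dp[i][j - 1] + GAP_PENALTY,                    # gap in s
            )

    # Trace back from the bottom-right corner, preferring aligned pairs.
    pattern: List[Optional[Element]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(s[i - 1], t[j - 1]):
            shared = s[i - 1] & t[j - 1]
            pattern.append(shared if shared else None)
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + GAP_PENALTY:
            pattern.append(None)
            i -= 1
        else:
            pattern.append(None)
            j -= 1
    pattern.reverse()
    return pattern


def trim(pattern: List[Optional[Element]]) -> List[Optional[Element]]:
    """Drop unmatched columns at both ends; what remains is the context window."""
    start, end = 0, len(pattern)
    while start < end and pattern[start] is None:
        start += 1
    while end > start and pattern[end - 1] is None:
        end -= 1
    return pattern[start:end]


def tok(*labels: str) -> Element:
    return frozenset(labels)


# "Mr. Smith visited Paris" vs. "Jones visited London" (toy annotations).
s1 = [tok("word:Mr.", "pos:NNP", "ann:PER"), tok("word:Smith", "pos:NNP", "ann:PER"),
      tok("word:visited", "pos:VBD"), tok("word:Paris", "pos:NNP", "ann:LOC")]
s2 = [tok("word:Jones", "pos:NNP", "ann:PER"),
      tok("word:visited", "pos:VBD"), tok("word:London", "pos:NNP", "ann:LOC")]

pattern = trim(align(s1, s2))
for element in pattern:
    print(sorted(element) if element else "*")
```

In this toy case the surviving pattern mixes levels: the first and last positions keep only the part-of-speech and entity labels, while the middle position also keeps the literal word "visited". The pattern's length comes from the alignment itself rather than from a fixed window size, which is the property the abstract emphasizes.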