CL, Apr 22, 2017

Lexical Features in Coreference Resolution: To be Used With Caution

arXiv:1704.06779v18.039 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses domain generalization in coreference resolution, an issue relevant to NLP practitioners. The contribution is incremental, since it critiques existing approaches.

The paper investigates a drawback of lexical features in coreference resolvers, showing they hinder generalization to unseen domains, and identifies flaws in current evaluation methods due to dataset overlap.

Lexical features are a major source of information in state-of-the-art coreference resolvers. Lexical features implicitly model some of the linguistic phenomena at a fine granularity level. They are especially useful for representing the context of mentions. In this paper we investigate a drawback of using many lexical features in state-of-the-art coreference resolvers. We show that if coreference resolvers mainly rely on lexical features, they can hardly generalize to unseen domains. Furthermore, we show that the current coreference resolution evaluation is clearly flawed by only evaluating on a specific split of a specific dataset in which there is a notable overlap between the training, development and test sets.
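The overlap claim in the abstract can be made concrete with a small sketch. This is not the measurement used in the paper: it assumes a hypothetical token-level approximation in which each mention is a plain string, and it reports the fraction of distinct test tokens that also occur in training mentions. The function names and the toy data are invented for illustration.

```python
from typing import Iterable, Set


def vocabulary(mentions: Iterable[str]) -> Set[str]:
    """Lower-cased set of whitespace tokens appearing in any mention."""
    return {tok.lower() for m in mentions for tok in m.split()}


def overlap_ratio(train_mentions: Iterable[str], test_mentions: Iterable[str]) -> float:
    """Fraction of distinct test tokens that also occur in training mentions.

    Hypothetical helper: a rough proxy for lexical overlap between splits,
    not the statistic reported by the authors.
    """
    train_vocab = vocabulary(train_mentions)
    test_vocab = vocabulary(test_mentions)
    if not test_vocab:
        return 0.0
    return len(test_vocab & train_vocab) / len(test_vocab)


# Toy data, invented for illustration only.
train = ["the president", "Barack Obama", "the company"]
in_domain_test = ["the president", "Obama"]
out_of_domain_test = ["the enzyme", "this protein"]

print(overlap_ratio(train, in_domain_test))
print(overlap_ratio(train, out_of_domain_test))
```

A high ratio on the standard test split together with a low ratio on data from another domain would be consistent with the abstract's concern: a resolver that leans on lexical features can score well where test tokens were seen in training and degrade where they were not.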
