A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy
This work addresses domain-adaptation challenges in NLP for researchers and practitioners who deal with inconsistent annotation schemes, though the contribution is incremental because it builds on existing neural methods.
The paper tackles named-entity recognition with multiple heterogeneously tagged training sets and a test tag-set that differs from all of them. It uses a tag hierarchy to guide learning and shows that the proposed model outperforms independent and multitasking models, especially in complex tag-set consolidation scenarios.
We study a variant of domain adaptation for named-entity recognition where multiple, heterogeneously tagged training sets are available. Furthermore, the test tag-set is not identical to any individual training tag-set. However, the relations between all tags are provided in a tag hierarchy, which covers the test tags as combinations of training tags. This setting occurs when various datasets are created using different annotation schemes. It also arises when a tag-set is extended with a new tag by annotating only that tag in a new dataset. We propose to use the given tag hierarchy to jointly learn a neural network that shares its tagging layer among all tag-sets. We compare this model to combining independent models and to a model based on the multitasking approach. Our experiments show the benefit of the tag-hierarchy model, especially when facing non-trivial consolidation of tag-sets.
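The abstract does not spell out how the shared tagging layer uses the hierarchy, so the following is only a minimal sketch of one plausible reading: a single layer scores the finest-grained tags, and the probability of a coarser tag is taken as the sum over the fine-grained tags it covers. The tag names, the `HIERARCHY` dictionary, and the helper functions are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math

# Hypothetical tag hierarchy: each coarse tag maps to the finer tags it covers.
HIERARCHY = {
    "LOCATION": ["CITY", "STREET"],
    "NAME": ["FIRST_NAME", "LAST_NAME"],
}


def leaves(tag, hierarchy=HIERARCHY):
    """Return the finest-grained tags covered by `tag` (the tag itself if it has no children)."""
    children = hierarchy.get(tag)
    if not children:
        return [tag]
    covered = []
    for child in children:
        covered.extend(leaves(child, hierarchy))
    return covered


def softmax(scores):
    """Turn a dict of raw scores into a dict of probabilities."""
    top = max(scores.values())
    exps = {tag: math.exp(score - top) for tag, score in scores.items()}
    total = sum(exps.values())
    return {tag: value / total for tag, value in exps.items()}


def tag_probability(leaf_probs, tag, hierarchy=HIERARCHY):
    """Probability of a possibly coarse tag: the sum over the leaf tags it covers."""
    return sum(leaf_probs[leaf] for leaf in leaves(tag, hierarchy))


def predict(leaf_probs, tag_set, hierarchy=HIERARCHY):
    """Pick the most probable tag from a given tag-set, which may mix granularities."""
    return max(tag_set, key=lambda tag: tag_probability(leaf_probs, tag, hierarchy))


# Made-up scores from the shared tagging layer for a single token.
scores = {"CITY": 2.0, "STREET": 1.0, "FIRST_NAME": 0.0, "LAST_NAME": 0.0, "O": 0.5}
probs = softmax(scores)

# A test tag-set that matches no single training tag-set: coarse LOCATION, fine names.
test_tag_set = ["LOCATION", "FIRST_NAME", "LAST_NAME", "O"]
prediction = predict(probs, test_tag_set)
```

Under this assumption, a dataset annotated only with `LOCATION` and one annotated with `CITY` and `STREET` can both supervise the same layer, since each dataset's labels are scored through `tag_probability` at its own granularity.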