Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach
This work addresses the annotation bottleneck for researchers and practitioners in information extraction, though its contribution is incremental, since it builds on existing representation learning approaches.
The paper tackles the cost of human annotation in relation extraction by proposing REHession, a framework that learns from heterogeneous supervision drawn from sources such as knowledge bases and domain heuristics. Its experiments report that REHession outperforms state-of-the-art methods.
Relation extraction is a fundamental task in information extraction. Most existing methods rely heavily on annotations labeled by human experts, which are costly and time-consuming to produce. To overcome this drawback, we propose a novel framework, REHession, that learns a relation extractor from annotations provided by heterogeneous information sources, e.g., knowledge bases and domain heuristics. These annotations, referred to as heterogeneous supervision, often conflict with each other, which brings a new challenge to the original relation extraction task: how to infer the true label from noisy labels for a given instance. Identifying context information as the backbone of both relation extraction and true label discovery, we adopt embedding techniques to learn distributed representations of context, which bridge all components and allow them to enhance one another in an iterative fashion. Extensive experimental results demonstrate the superiority of REHession over the state of the art.
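To make the conflict problem concrete, the following is a minimal sketch of how annotations from a knowledge base and from domain heuristics can disagree on the same instance. It is not REHession's method: the toy knowledge base, the entity names, the three labeling functions, and the plain vote counting are all hypothetical choices made for illustration, and voting stands in only as a naive baseline for true label discovery.

```python
from collections import Counter

# Hypothetical toy knowledge base: (head, tail) entity pair -> relation type.
TOY_KB = {("Alice", "Springfield"): "born_in"}


def lf_knowledge_base(sentence, head, tail):
    """Distant supervision: return the KB relation for the pair, if any."""
    return TOY_KB.get((head, tail))


def lf_born_pattern(sentence, head, tail):
    """Domain heuristic: the phrase 'born in' suggests born_in."""
    return "born_in" if "born in" in sentence else None


def lf_died_pattern(sentence, head, tail):
    """Domain heuristic: the phrase 'died in' suggests died_in."""
    return "died_in" if "died in" in sentence else None


# Illustrative set of heterogeneous supervision sources (an assumption).
LABELING_FUNCTIONS = [lf_knowledge_base, lf_born_pattern, lf_died_pattern]


def annotate(sentence, head, tail):
    """Collect every label that fires, keyed by the source that produced it."""
    labels = {}
    for lf in LABELING_FUNCTIONS:
        label = lf(sentence, head, tail)
        if label is not None:
            labels[lf.__name__] = label
    return labels


def top_voted(labels):
    """Return all labels tied for the most votes, sorted alphabetically."""
    if not labels:
        return []
    counts = Counter(labels.values())
    best = max(counts.values())
    return sorted(label for label, n in counts.items() if n == best)


agreeing = annotate("Alice was born in Springfield.", "Alice", "Springfield")
conflicting = annotate("Alice died in Springfield.", "Alice", "Springfield")
print(agreeing, top_voted(agreeing))
print(conflicting, top_voted(conflicting))
```

In the second sentence the knowledge base still proposes `born_in` because it ignores the sentence, while the pattern heuristic proposes `died_in`, so the vote ends in a tie. Resolving such cases requires looking at the context of the instance, which is the role the abstract assigns to the learned context representations.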