CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
This work addresses noise reduction in relation extraction for natural language processing, offering a novel approach that enhances model performance without requiring costly extra annotations.
The paper tackles the problem of noise in distantly supervised relation extraction by proposing a contrastive instance learning framework, which significantly improves performance over previous methods on datasets like NYT10, GDS, and KBP.
Efforts to reduce noise in training data generated by distant supervision (DS) began as soon as DS was first introduced into the relation extraction (RE) task. For the past decade, researchers have applied the multi-instance learning (MIL) framework to find the most reliable feature from a bag of sentences. Although the pattern of MIL bags can greatly reduce DS noise, it fails to represent many other useful sentence features in the datasets. In many cases, these sentence features can only be acquired through extra sentence-level human annotation at heavy cost. Therefore, the performance of distantly supervised RE models is bounded. In this paper, we go beyond the typical MIL framework and propose a novel contrastive instance learning (CIL) framework. Specifically, we regard the initial MIL as the relational triple encoder and constrain positive pairs against negative pairs for each instance. Experiments demonstrate the effectiveness of our proposed framework, with significant improvements over the previous methods on NYT10, GDS and KBP.
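The idea of constraining positive pairs against negative pairs can be illustrated with a generic InfoNCE-style contrastive loss. The abstract does not specify the loss that CIL uses, so the sketch below is a stand-in for illustration, not the authors' objective. The names `cosine` and `info_nce_loss`, the `temperature` parameter, and the toy vectors are all assumptions; in a real system the vectors would come from a sentence encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss for one instance (hypothetical helper).

    The loss is small when the anchor is closer to its positive than
    to any of the negatives, and grows as negatives become more similar.
    """
    pos_logit = cosine(anchor, positive) / temperature
    neg_logits = [cosine(anchor, n) / temperature for n in negatives]
    logits = [pos_logit] + neg_logits
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denom - pos_logit

# Toy representations (assumed values, for illustration only).
anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce_loss(anchor, [0.9, 0.1], negatives)
loss_misaligned = info_nce_loss(anchor, [0.0, 1.0], negatives)
print(loss_aligned < loss_misaligned)
```

In this sketch, a positive that points in nearly the same direction as the anchor yields a lower loss than one that is orthogonal to it, which is the behavior a contrastive objective is meant to reward.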