CL · May 24, 2018

DSGAN: Generative Adversarial Training for Distant Supervision Relation Extraction

Pengda Qin, Weiran Xu, William Yang Wang

arXiv:1805.09929v132.61155 citations

Originality: Incremental advance

AI Analysis

This paper addresses label noise in distantly supervised relation extraction and offers NLP researchers a new method for cleaning training datasets. The advance is incremental because it builds on existing adversarial approaches.

The paper tackles the noise labeling problem in distant supervision relation extraction by introducing DSGAN, an adversarial learning framework that filters false positive instances at the sentence level, resulting in significant performance improvements compared to state-of-the-art systems.

Distant supervision can effectively label data for relation extraction, but it suffers from the noisy labeling problem. Recent work mainly applies soft, bag-level noise reduction strategies that find the relatively better samples in a sentence bag, which is suboptimal compared with making a hard decision about false positive samples at the sentence level. In this paper, we introduce an adversarial learning framework, which we name DSGAN, to learn a sentence-level true-positive generator. Inspired by Generative Adversarial Networks, we regard the positive samples generated by the generator as the negative samples to train the discriminator. The optimal generator is obtained when the discrimination ability of the discriminator shows its greatest decline. We use the generator to filter the distant supervision training dataset and redistribute the false positive instances into the negative set, thereby providing a cleaned dataset for relation classification. The experimental results show that the proposed strategy significantly improves the performance of distant supervision relation extraction compared to state-of-the-art systems.
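
The filtering and redistribution step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `redistribute`, the 0.5 threshold, and the keyword-based `toy_scorer` are assumptions made for illustration. In DSGAN the per-sentence score would come from the trained generator, which is not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple


def redistribute(
    positives: Sequence[str],
    negatives: Sequence[str],
    true_positive_prob: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[List[str], List[str]]:
    """Split a distantly supervised positive set at the sentence level.

    Sentences scored at or above `threshold` stay positive; the rest are
    treated as false positives and appended to the negative set.
    """
    kept: List[str] = []
    moved: List[str] = []
    for sentence in positives:
        if true_positive_prob(sentence) >= threshold:
            kept.append(sentence)
        else:
            moved.append(sentence)
    return kept, list(negatives) + moved


def toy_scorer(sentence: str) -> float:
    """Hypothetical stand-in for a trained generator's true-positive probability."""
    return 0.9 if "born in" in sentence else 0.1
```

Called with a positive set for a "place of birth" relation, `redistribute(positives, negatives, toy_scorer)` keeps the sentences the scorer accepts and moves the others into the negative set, which mirrors the hard sentence-level decision that the abstract contrasts with soft bag-level weighting.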
