Reinforcement Learning-based N-ary Cross-Sentence Relation Extraction
This work addresses data quality issues in relation extraction for NLP applications, representing an incremental improvement over existing methods.
The paper tackles two problems in n-ary cross-sentence relation extraction: noisily labeled data and missed non-consecutive sentences. It relaxes the distant supervision assumption and uses a reinforcement learning-based sentence distribution estimator to select correctly labeled sentences, achieving better performance than baselines.
Models for n-ary cross-sentence relation extraction based on distant supervision assume that consecutive sentences mentioning n entities describe the relation among those n entities. On one hand, this assumption introduces noisily labeled data and harms the models' performance. On the other hand, some non-consecutive sentences also describe a relation, and these sentences cannot be labeled under this assumption. In this paper, we replace this strong assumption with a weaker distant supervision assumption to address the second issue, and we propose a novel sentence distribution estimator model to address the first. The estimator, a two-level agent reinforcement learning model, selects correctly labeled sentences to alleviate the effect of noisy data. In addition, we propose a novel universal relation extractor that combines an attention mechanism with PCNN, so that it can be applied to any task, whether the sentences are consecutive or non-consecutive. Experiments demonstrate that the proposed model reduces the impact of noisy data and achieves better performance on the general n-ary cross-sentence relation extraction task than baseline models.
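To make the two labeling assumptions concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not spell out the weaker assumption, so the sketch assumes it means "any small set of sentences in a document that together mention all n entities". The function names, the substring-based entity matching, the `max_span` and `max_size` limits, and the toy sentences are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def mentions(sentence, entity):
    # Hypothetical matcher: case-insensitive substring test.
    return entity.lower() in sentence.lower()


def covers(sentences, indices, entities):
    # True if every entity is mentioned by at least one selected sentence.
    return all(
        any(mentions(sentences[i], e) for i in indices) for e in entities
    )


def consecutive_candidates(sentences, entities, max_span=3):
    # Strong assumption: a window of consecutive sentences that mentions
    # all n entities is labeled with their relation.
    found = []
    n = len(sentences)
    for start in range(n):
        for end in range(start + 1, min(start + max_span, n) + 1):
            window = tuple(range(start, end))
            if covers(sentences, window, entities):
                found.append(window)  # keep the shortest window per start
                break
    return found


def relaxed_candidates(sentences, entities, max_size=3):
    # Assumed weaker rule: any set of sentences, consecutive or not, that
    # each mention at least one entity and together mention all of them.
    relevant = [
        i for i, s in enumerate(sentences)
        if any(mentions(s, e) for e in entities)
    ]
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(relevant, size):
            if covers(sentences, combo, entities):
                found.append(combo)
    return found


# Toy document (invented for illustration only).
sentences = [
    "Gefitinib was given to the patient.",
    "The weather was mild that week.",
    "An EGFR mutation L858E was detected.",
]
entities = ("gefitinib", "EGFR", "L858E")

strong = consecutive_candidates(sentences, entities, max_span=3)
relaxed = relaxed_candidates(sentences, entities, max_size=2)
```

In this toy document the consecutive window has to include the unrelated middle sentence, which illustrates one source of noise, while the relaxed rule can pair the first and third sentences directly. Either way the resulting candidates are only distantly labeled, which is why the abstract adds a sentence distribution estimator to select the correctly labeled ones.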