Joint Bootstrapping Machines for High Confidence Relation Extraction
This work addresses low-confidence relation extraction, a challenge for NLP researchers; its contribution is incremental in that it builds on existing bootstrapping techniques.
The paper tackles semantic drift in semi-supervised bootstrapping for relation extraction. It introduces BREX, a method that uses entity and template seeds jointly, expands both in parallel, and applies improved template similarity measures. Across four relationships, BREX improves F1 by 0.13 over the state of the art (0.87 vs. 0.74).
Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed instances. Due to the lack of labeled data, a key challenge in bootstrapping is semantic drift: if a false positive instance is added during an iteration, then all following iterations are contaminated. We introduce BREX, a new bootstrapping method that protects against such contamination by highly effective confidence assessment. This is achieved by using entity and template seeds jointly (as opposed to just one as in previous work), by expanding entities and templates in parallel and in a mutually constraining fashion in each iteration, and by introducing higher-quality similarity measures for templates. Experimental results show that BREX achieves an F1 that is 0.13 (0.87 vs. 0.74) better than the state of the art for four relationships.
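The general idea of joint, mutually constraining expansion can be illustrated with a toy loop. The sketch below is not the BREX algorithm: the function name `bootstrap`, the parameter `min_conf`, the example corpus, and the overlap-ratio confidence score are all assumptions made for illustration, standing in for the confidence assessment and template similarity measures described in the abstract. It only shows how entity pairs and templates can be expanded in parallel from the same starting state, with each set gating admission to the other.

```python
from collections import defaultdict


def bootstrap(corpus, seed_pairs, seed_templates, min_conf=0.5, iterations=3):
    """Toy joint bootstrapping over (entity_pair, template) mentions.

    Hypothetical illustration only: confidence here is a simple overlap
    ratio, not the measure used by BREX.
    """
    pairs, templates = set(seed_pairs), set(seed_templates)

    # Index the corpus once: which pairs each template matches, and vice versa.
    pairs_by_template = defaultdict(set)
    templates_by_pair = defaultdict(set)
    for pair, template in corpus:
        pairs_by_template[template].add(pair)
        templates_by_pair[pair].add(template)

    for _ in range(iterations):
        # Both candidate sets are computed from the state at the start of
        # the iteration, so the two expansions run in parallel.
        new_templates = {
            t for t, ps in pairs_by_template.items()
            if len(ps & pairs) / len(ps) >= min_conf  # trusted-pair ratio
        }
        new_pairs = {
            p for p, ts in templates_by_pair.items()
            if len(ts & templates) / len(ts) >= min_conf  # trusted-template ratio
        }
        if new_templates <= templates and new_pairs <= pairs:
            break  # nothing new was admitted; stop early
        templates |= new_templates
        pairs |= new_pairs
    return pairs, templates


# Made-up mentions for an "acquired" relation, plus unrelated distractors.
corpus = [
    (("Google", "YouTube"), "X acquired Y"),
    (("Google", "YouTube"), "X bought Y"),
    (("Facebook", "Instagram"), "X acquired Y"),
    (("Facebook", "Instagram"), "X bought Y"),
    (("Google", "Mountain View"), "X is based in Y"),
    (("Facebook", "Menlo Park"), "X is based in Y"),
    (("Google", "Apple"), "X competes with Y"),
]

pairs, templates = bootstrap(
    corpus,
    seed_pairs={("Google", "YouTube")},
    seed_templates={"X acquired Y"},
)
print(sorted(pairs))
print(sorted(templates))
```

On this toy corpus the loop admits the second acquisition pair and the paraphrased template, while the location and competitor mentions never reach the confidence threshold. A stricter `min_conf` admits less and lowers the risk of a false positive contaminating later iterations, at the cost of recall.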