Leveraging Unlabeled Data for Entity-Relation Extraction through Probabilistic Constraint Satisfaction
This work addresses the problem of improving extraction accuracy for NLP researchers who have limited labeled data. The contribution is incremental, since it builds on prior methods for integrating constraints.
The paper tackles entity-relation extraction with symbolic domain knowledge, using semantic loss to enforce logical constraints exactly, and reports that it outperforms the baselines by a wide margin in low-data regimes.
We study the problem of entity-relation extraction in the presence of symbolic domain knowledge. Such knowledge takes the form of an ontology defining relations and their permissible arguments. Previous approaches integrate such knowledge into learning either through self-training or through approximations that lose the precise meaning of the logical expressions. By contrast, our approach employs semantic loss, which captures the precise meaning of a logical sentence by maintaining a probability distribution over all possible states and guiding the model toward solutions that minimize constraint violations. With a focus on low-data regimes, we show that semantic loss outperforms the baselines by a wide margin.
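To make the idea of semantic loss concrete, the following is a minimal sketch, not the paper's implementation. It assumes the standard formulation in which the loss is the negative log of the total probability mass that independent Bernoulli outputs assign to states satisfying the constraint. The `works_for` relation, its Person/Organization argument types, and the function names are hypothetical, chosen only to illustrate an ontology-style constraint. States are enumerated by brute force, which is feasible only for a handful of variables.

```python
import itertools
import math


def semantic_loss(probs, constraint):
    """Negative log-probability that independent Bernoulli outputs satisfy `constraint`.

    probs: predicted probability that each Boolean variable is True.
    constraint: function mapping a tuple of Booleans (one state) to True/False.
    Assumes at least one satisfying state has nonzero probability.
    """
    satisfied_mass = 0.0
    # Enumerate every possible state (2^n of them; illustrative only).
    for state in itertools.product([False, True], repeat=len(probs)):
        if constraint(state):
            weight = 1.0
            for p, value in zip(probs, state):
                weight *= p if value else (1.0 - p)
            satisfied_mass += weight
    return -math.log(satisfied_mass)


def works_for_constraint(state):
    # Hypothetical ontology rule: works_for(a1, a2) implies
    # a1 is a Person and a2 is an Organization.
    rel, a1_is_person, a2_is_org = state
    return (not rel) or (a1_is_person and a2_is_org)


# Predictions that agree with the ontology incur a small loss ...
consistent = semantic_loss([0.9, 0.95, 0.95], works_for_constraint)
# ... while predictions that assert the relation with ill-typed arguments are penalized.
violating = semantic_loss([0.9, 0.1, 0.1], works_for_constraint)
print(consistent, violating)
```

Because the loss depends only on the model's predicted probabilities and the constraint, it can be evaluated on unlabeled examples, which is what makes it attractive in low-data regimes. A practical system would replace the exhaustive enumeration with a tractable compilation of the constraint.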