Cross-stitching Text and Knowledge Graph Encoders for Distantly Supervised Relation Extraction
The work addresses a bottleneck in distantly supervised relation extraction, namely the limited exchange of information between the text and knowledge graph encoders, though it is incremental over prior bi-encoder architectures.
The paper tackles the problem of limited interaction between the text and knowledge graph encoders in distantly supervised relation extraction by introducing a cross-stitch mechanism for full bidirectional sharing, and reports strong improvements on two benchmarks from different domains.
Bi-encoder architectures for distantly-supervised relation extraction are designed to make use of the complementary information found in text and knowledge graphs (KGs). However, current architectures suffer from two drawbacks. They either do not allow any sharing between the text encoder and the KG encoder at all, or, in the case of models with KG-to-text attention, only share information in one direction. Here, we introduce cross-stitch bi-encoders, which allow full interaction between the text encoder and the KG encoder via a cross-stitch mechanism. The cross-stitch mechanism allows sharing and updating representations between the two encoders at any layer, with the amount of sharing being dynamically controlled via cross-attention-based gates. Experimental results on two relation extraction benchmarks from two different domains show that enabling full interaction between the two encoders yields strong improvements.
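To make the idea of gated, bidirectional sharing concrete, the following is a minimal NumPy sketch of one sharing step between a text encoder layer and a KG encoder layer. It is an illustration under stated assumptions, not the paper's actual formulation: the function name `cross_stitch`, the gate weights `w_text` and `w_kg`, the scaled dot-product attention, and the convex-combination update are all hypothetical choices made here for clarity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cross_stitch(text, kg, w_text, w_kg):
    """One bidirectional sharing step (illustrative assumption, not the paper's equations).

    text:   (n_text, d) hidden states from a text encoder layer
    kg:     (n_kg, d)   hidden states from a KG encoder layer
    w_text: (d, d)      hypothetical gate weights for the text side
    w_kg:   (d, d)      hypothetical gate weights for the KG side
    """
    d = text.shape[-1]
    scores = text @ kg.T / np.sqrt(d)               # (n_text, n_kg)

    # Cross-attention in both directions.
    kg_ctx = softmax(scores, axis=-1) @ kg          # KG summary per text position
    text_ctx = softmax(scores.T, axis=-1) @ text    # text summary per KG position

    # Gates computed from the cross-attention outputs decide, per dimension,
    # how much of the other encoder's information is mixed in.
    gate_text = sigmoid(kg_ctx @ w_text)            # (n_text, d)
    gate_kg = sigmoid(text_ctx @ w_kg)              # (n_kg, d)

    new_text = gate_text * text + (1.0 - gate_text) * kg_ctx
    new_kg = gate_kg * kg + (1.0 - gate_kg) * text_ctx
    return new_text, new_kg


rng = np.random.default_rng(0)
text = rng.normal(size=(5, 8))   # 5 text positions, hidden size 8
kg = rng.normal(size=(3, 8))     # 3 KG elements, hidden size 8
w_text = rng.normal(size=(8, 8))
w_kg = rng.normal(size=(8, 8))

new_text, new_kg = cross_stitch(text, kg, w_text, w_kg)
print(new_text.shape, new_kg.shape)  # (5, 8) (3, 8)
```

Because both encoders receive an updated representation, information flows in both directions, in contrast to KG-to-text attention, which would update only the text side. In this sketch a gate value near 1 keeps an encoder's own representation, while a value near 0 replaces it with the attended summary from the other encoder.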