LG, AI, CL, SI | Mar 27, 2023

Mutually-paced Knowledge Distillation for Cross-lingual Temporal Knowledge Graph Reasoning

Amazon
arXiv:2303.14898v1 | 15 citations | h-index: 87
Originality: Incremental advance
AI Analysis

The paper addresses reasoning on incomplete temporal knowledge graphs in low-resource languages, an incremental advance for cross-lingual AI applications.

This paper tackles cross-lingual temporal knowledge graph reasoning with a mutually-paced knowledge distillation model (MP-KD) that transfers knowledge from high-resource to low-resource languages. It addresses the scarcity of alignments by generating pseudo alignments, and it controls the effect of temporal discrepancies with an attention mechanism. Experiments on twelve cross-lingual transfer tasks in the EventKG benchmark show that the method is effective.

This paper investigates the cross-lingual temporal knowledge graph reasoning problem, which aims to facilitate reasoning on Temporal Knowledge Graphs (TKGs) in low-resource languages by transferring knowledge from TKGs in high-resource ones. The ability to distill knowledge across TKGs in different languages is becoming increasingly crucial, given the unsatisfactory performance of existing reasoning methods on severely incomplete TKGs, especially in low-resource languages. However, it poses tremendous challenges in two respects. First, the cross-lingual alignments, which serve as bridges for knowledge transfer, are usually too scarce to transfer sufficient knowledge between two TKGs. Second, the temporal knowledge discrepancy between aligned entities, especially when alignments are unreliable, can mislead the knowledge distillation process. We correspondingly propose a mutually-paced knowledge distillation model, MP-KD, in which a teacher network trained on a source TKG guides the training of a student network on target TKGs with an alignment module. Concretely, to deal with the scarcity issue, MP-KD generates pseudo alignments between TKGs based on the temporal information extracted by our representation module. To maximize the efficacy of knowledge transfer and control the noise caused by the temporal knowledge discrepancy, we enhance MP-KD with a temporal cross-lingual attention mechanism that dynamically estimates the alignment strength. The two procedures are mutually paced along with model training. Extensive experiments on twelve cross-lingual TKG transfer tasks in the EventKG benchmark demonstrate the effectiveness of the proposed MP-KD method.
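The two procedures named in the abstract, pseudo-alignment generation and alignment-strength weighting, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the cosine-similarity threshold, the sigmoid weighting, and the squared-error distillation term are all assumptions chosen for illustration, and the sketch ignores time by treating each entity as a single fixed vector rather than a temporal representation.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def pseudo_alignments(student_emb, teacher_emb, known, threshold=0.9):
    """Propose (target, source) pairs for entities that are not yet aligned.

    Hypothetical rule: pair each unaligned target entity with its most
    similar unaligned source entity if the similarity reaches `threshold`.
    """
    aligned_targets = {t for t, _ in known}
    aligned_sources = {s for _, s in known}
    proposals = []
    for t_id, t_vec in student_emb.items():
        if t_id in aligned_targets:
            continue
        best_id, best_sim = None, threshold
        for s_id, s_vec in teacher_emb.items():
            if s_id in aligned_sources:
                continue
            sim = cosine(t_vec, s_vec)
            if sim >= best_sim:
                best_id, best_sim = s_id, sim
        if best_id is not None:
            proposals.append((t_id, best_id))
    return proposals

def alignment_strength(t_vec, s_vec):
    """Assumed stand-in for attention: sigmoid of the dot product, in (0, 1)."""
    dot = sum(a * b for a, b in zip(t_vec, s_vec))
    return 1.0 / (1.0 + math.exp(-dot))

def distillation_loss(student_emb, teacher_emb, pairs):
    """Mean squared distance between paired embeddings, weighted by strength."""
    if not pairs:
        return 0.0
    total = 0.0
    for t_id, s_id in pairs:
        t_vec, s_vec = student_emb[t_id], teacher_emb[s_id]
        weight = alignment_strength(t_vec, s_vec)
        total += weight * sum((a - b) ** 2 for a, b in zip(t_vec, s_vec))
    return total / len(pairs)

# Toy data: entity names and vectors are made up for illustration.
teacher = {"en:A": [1.0, 0.0], "en:B": [0.0, 1.0]}
student = {"fr:A": [0.9, 0.1], "fr:B": [0.1, 0.95]}
known = [("fr:A", "en:A")]
pairs = known + pseudo_alignments(student, teacher, known)
loss = distillation_loss(student, teacher, pairs)
```

In a training loop, one would presumably alternate between refreshing the pseudo alignments from the current embeddings and minimizing the weighted loss, which is one reading of "mutually paced"; the abstract does not specify the schedule.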

