LG AI MLMay 10, 2024

PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning

Jaejun Lee, Minsung Hwang, Joyce Jiyoung Whang

arXiv:2405.06418v26.42 citationsh-index: 16Has CodeICML

Originality Incremental advance

AI Analysis

This provides theoretical grounding for practical design choices in knowledge graph representation learning, addressing a gap for researchers and practitioners in the field, though it is incremental in extending existing theory to a new domain.

The authors tackled the lack of theoretical analysis in knowledge graph representation learning by presenting the first PAC-Bayesian generalization bounds for such methods, using a framework that covers at least 15 existing models and empirically showing that these bounds explain actual generalization errors on three real-world knowledge graphs.

While a number of knowledge graph representation learning (KGRL) methods have been proposed over the past decade, very few theoretical analyses have been conducted on them. In this paper, we present the first PAC-Bayesian generalization bounds for KGRL methods. To analyze a broad class of KGRL models, we propose a generic framework named ReED (Relation-aware Encoder-Decoder), which consists of a relation-aware message passing encoder and a triplet classification decoder. Our ReED framework can express at least 15 different existing KGRL models, including not only graph neural network-based models such as R-GCN and CompGCN but also shallow-architecture models such as RotatE and ANALOGY. Our generalization bounds for the ReED framework provide theoretical grounds for the commonly used tricks in KGRL, e.g., parameter-sharing and weight normalization schemes, and guide desirable design choices for practical KGRL methods. We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs.

View on arXiv PDF Code

Similar