CL, AI, LG, MM | Sep 2, 2022

Multi-modal Contrastive Representation Learning for Entity Alignment

Tencent
arXiv:2209.00891v1 | 600 citations | h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses entity alignment in multi-modal knowledge graphs, which is important for integrating heterogeneous data sources, though it appears incremental as it builds on existing contrastive learning approaches.

The paper tackles the problem of multi-modal entity alignment between knowledge graphs with structural triples and images, proposing MCLEA to address modality heterogeneity. The model outperforms state-of-the-art baselines on public datasets in supervised and unsupervised settings.

Multi-modal entity alignment aims to identify equivalent entities between two different multi-modal knowledge graphs, which consist of structural triples and images associated with entities. Most previous works focus on how to utilize and encode information from different modalities, while it is not trivial to leverage multi-modal knowledge in entity alignment because of the modality heterogeneity. In this paper, we propose MCLEA, a Multi-modal Contrastive Learning based Entity Alignment model, to obtain effective joint representations for multi-modal entity alignment. Different from previous works, MCLEA considers task-oriented modality and models the inter-modal relationships for each entity representation. In particular, MCLEA firstly learns multiple individual representations from multiple modalities, and then performs contrastive learning to jointly model intra-modal and inter-modal interactions. Extensive experimental results show that MCLEA outperforms state-of-the-art baselines on public datasets under both supervised and unsupervised settings.
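The abstract describes contrastive learning over intra-modal and inter-modal interactions but does not give the loss functions, so the following is only a minimal sketch under stated assumptions, not MCLEA's actual objective. It assumes a generic InfoNCE loss with in-batch negatives, embeddings already projected to one shared dimension, and row i of the first graph being aligned with row i of the second. The names `info_nce`, `joint_loss`, and the temperature `tau` are hypothetical and chosen for illustration.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(anchors, positives, tau=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other row of `positives` serves as an in-batch negative."""
    anchors = [normalize(a) for a in anchors]
    positives = [normalize(p) for p in positives]
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [dot(a, p) / tau for p in positives]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def joint_loss(kg1, kg2, tau=0.1):
    """kg1, kg2: dict mapping modality name -> list of entity embeddings.
    Assumption: both graphs list their seed-aligned entities in the same order."""
    modalities = sorted(kg1)
    # Intra-modal term: same modality, equivalent entities across the two graphs.
    intra = sum(
        info_nce(kg1[m], kg2[m], tau) + info_nce(kg2[m], kg1[m], tau)
        for m in modalities
    )
    # Inter-modal term: different modalities of the same entity within one graph.
    inter = 0.0
    for graph in (kg1, kg2):
        for i, m in enumerate(modalities):
            for n in modalities[i + 1:]:
                inter += info_nce(graph[m], graph[n], tau)
    return intra + inter


# Toy data: two entities, two modalities, two-dimensional embeddings.
kg1 = {"structure": [[1.0, 0.0], [0.0, 1.0]], "image": [[0.9, 0.1], [0.1, 0.9]]}
kg2 = {"structure": [[0.8, 0.2], [0.2, 0.8]], "image": [[1.0, 0.0], [0.0, 1.0]]}
print("joint loss:", joint_loss(kg1, kg2))
```

In this toy setup, swapping the entity order in one graph raises the loss, which is the behaviour a contrastive alignment objective should show. A real model would learn the encoders by minimising such a loss with an autodiff framework; this sketch only evaluates it.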

Code Implementations: 1 repo
