Knowledge Graph Completion with Pre-trained Multimodal Transformer and Twins Negative Sampling
This work addresses knowledge graph completion for multimodal data, representing an incremental improvement over existing embedding-based and finetune-based methods.
The paper tackles the problem of incomplete multimodal knowledge graphs by proposing VBKGC, a VisualBERT-enhanced model that integrates deeply fused multimodal entity information into the KGC model. It also introduces twins negative sampling, a negative sampling strategy co-designed with the KGC model, and reports outstanding performance on the link prediction task.
Knowledge graphs (KGs), which model world knowledge as structural triples, are inevitably incomplete, and the same problem exists for multimodal knowledge graphs (MMKGs). Knowledge graph completion (KGC), which predicts the missing triples in existing KGs, is therefore of great importance. Among existing KGC methods, embedding-based methods rely on manual design to leverage multimodal information, while finetune-based approaches do not outperform embedding-based methods in link prediction. To address these problems, we propose a VisualBERT-enhanced Knowledge Graph Completion model (VBKGC for short). VBKGC can capture deeply fused multimodal information for entities and integrate it into the KGC model. In addition, we co-design the KGC model and its negative sampling by proposing a new negative sampling strategy called twins negative sampling. Twins negative sampling is suited to multimodal scenarios and can align the different embeddings of an entity. We conduct extensive experiments to show the outstanding performance of VBKGC on the link prediction task and further explore the model.
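The abstract does not spell out how twins negative sampling works, so the following is only a minimal sketch of the general idea of negative sampling for link prediction, under one assumed reading of "align": a corrupted triple is built by replacing the head or tail with a random entity, and the same entity id is then used to look up both a structural and a visual embedding. The function names `corrupt_triple` and `lookup_aligned`, the two embedding tables, and the uniform head-or-tail corruption are illustrative assumptions, not the method described in the paper.

```python
import random


def corrupt_triple(triple, num_entities, known_triples, rng):
    """Build a negative triple by replacing the head or the tail.

    Assumption: uniform corruption with filtering of known true triples.
    This is generic KGC negative sampling, not the paper's exact strategy.
    """
    head, rel, tail = triple
    while True:
        candidate = rng.randrange(num_entities)
        if rng.random() < 0.5:
            negative = (candidate, rel, tail)  # corrupt the head
        else:
            negative = (head, rel, candidate)  # corrupt the tail
        if negative not in known_triples:
            return negative


def lookup_aligned(triple, structural, visual):
    """Fetch both embeddings of each entity with the same entity id.

    Assumption: "aligned" here means the structural and visual embeddings
    in a sample always belong to the same entity.
    """
    head, rel, tail = triple
    return (structural[head], visual[head]), rel, (structural[tail], visual[tail])


# Toy data: 4 entities, 1 relation, 1-dimensional hypothetical embeddings.
known = {(0, 0, 1), (1, 0, 2)}
structural = [[float(i)] for i in range(4)]
visual = [[float(10 + i)] for i in range(4)]

rng = random.Random(0)
negative = corrupt_triple((0, 0, 1), 4, known, rng)
head_pair, relation, tail_pair = lookup_aligned(negative, structural, visual)
```

In this sketch the positive triple `(0, 0, 1)` and its negative share the relation and one of the two entities, and both embeddings of the replaced entity come from a single sampled id.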