Multi-modal Knowledge Graph Generation with Semantics-enriched Prompts
This work addresses the scarcity of MMKGs for knowledge representation across domains, offering an incremental improvement in image selection for graph enrichment.
The paper tackles the challenge of constructing Multi-modal Knowledge Graphs (MMKGs) from conventional KGs by generating higher-quality, contextually relevant images. Its Visualizable Structural Neighbor Selection (VSNS) method improves image quality and relevance on two datasets, MKG-Y and DB15K.
Multi-modal Knowledge Graphs (MMKGs) have been widely applied across various domains for knowledge representation. However, far fewer MMKGs exist than are needed, and their construction faces numerous challenges, particularly in selecting high-quality, contextually relevant images for knowledge graph enrichment. To address these challenges, we present a framework for constructing MMKGs from conventional KGs. Furthermore, to generate higher-quality images that are more relevant to the context of the given knowledge graph, we design a neighbor selection method called Visualizable Structural Neighbor Selection (VSNS). This method consists of two modules: Visualizable Neighbor Selection (VNS) and Structural Neighbor Selection (SNS). The VNS module filters out relations that are difficult to visualize, while the SNS module selects the neighbors that most effectively capture the structural characteristics of the entity. To evaluate the quality of the generated images, we perform qualitative and quantitative evaluations on two datasets, MKG-Y and DB15K. The experimental results indicate that selecting neighbors with the VSNS method yields higher-quality images that are more relevant to the knowledge graph.
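The two-stage neighbor selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how VNS decides that a relation is hard to visualize, nor how SNS scores structural importance. The blocklist `HARD_TO_VISUALIZE`, the degree-based ranking, the toy triples, and the function names `visualizable_neighbors`, `structural_neighbors`, and `build_prompt` are all assumptions introduced only for illustration.

```python
from collections import Counter

# Toy knowledge graph as (head, relation, tail) triples (illustrative data only).
TRIPLES = [
    ("Eiffel_Tower", "locatedIn", "Paris"),
    ("Eiffel_Tower", "architect", "Stephen_Sauvestre"),
    ("Eiffel_Tower", "openingDate", "1889-03-31"),
    ("Eiffel_Tower", "material", "Wrought_iron"),
    ("Louvre", "locatedIn", "Paris"),
    ("Notre-Dame_de_Paris", "locatedIn", "Paris"),
]

# Assumption: a hand-written blocklist stands in for the VNS criterion.
HARD_TO_VISUALIZE = {"openingDate", "populationTotal", "isbn"}


def visualizable_neighbors(triples, entity, blocked=HARD_TO_VISUALIZE):
    """VNS-like step: keep (relation, neighbor) pairs whose relation is not blocked."""
    return [(r, t) for h, r, t in triples if h == entity and r not in blocked]


def structural_neighbors(triples, candidates, k=2):
    """SNS-like step. Assumption: node degree is used as a stand-in for
    structural importance; ties are broken alphabetically by neighbor name."""
    degree = Counter()
    for h, _, t in triples:
        degree[h] += 1
        degree[t] += 1
    ranked = sorted(candidates, key=lambda rt: (-degree[rt[1]], rt[1]))
    return ranked[:k]


def build_prompt(entity, neighbors):
    """Turn the selected neighbors into a text prompt for an image generator."""
    context = ", ".join(f"{r} {t}" for r, t in neighbors)
    return f"An image of {entity}" + (f", {context}" if context else "")


candidates = visualizable_neighbors(TRIPLES, "Eiffel_Tower")
selected = structural_neighbors(TRIPLES, candidates, k=2)
prompt = build_prompt("Eiffel_Tower", selected)
print(prompt)
```

In this sketch the date-valued relation is dropped before ranking, so the prompt is built only from neighbors that an image generator could plausibly depict. Replacing the blocklist or the degree heuristic with a learned scorer would not change the overall shape of the pipeline.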