Multi-Modal Knowledge Graph Construction and Application: A Survey
This survey provides a comprehensive overview for researchers and practitioners working on knowledge engineering and multi-modal AI, but its contribution is incremental: it synthesizes existing work rather than introducing new methods.
This survey addresses the limitation of symbolic knowledge graphs by exploring multi-modal knowledge graphs (MMKGs) that integrate text and images to enhance machine understanding of the real world, reviewing challenges, progress, and applications in the field.
Recent years have witnessed a resurgence of knowledge engineering, characterized by the rapid growth of knowledge graphs. However, most existing knowledge graphs are represented with pure symbols, which limits a machine's ability to understand the real world. The multi-modalization of knowledge graphs is an inevitable key step towards the realization of human-level machine intelligence. The results of this endeavor are Multi-modal Knowledge Graphs (MMKGs). In this survey on MMKGs constructed from texts and images, we first give definitions of MMKGs, followed by preliminaries on multi-modal tasks and techniques. We then systematically review the challenges, progress, and opportunities in the construction and application of MMKGs, respectively, with detailed analyses of the strengths and weaknesses of different solutions. We conclude this survey with open research problems relevant to MMKGs.