AI DB, Sep 24, 2020

Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases

Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek

arXiv:2009.11564v2 27.2149 citations

Originality: Synthesis-oriented

AI Analysis

The article addresses the problem of equipping machines with comprehensive world knowledge for AI applications. Its contribution is incremental, because it surveys existing concepts and methods.

This article surveys methods for creating and curating large-scale knowledge bases from web and text sources, which are used to enhance search engines, natural language processing, and data analytics by semantically interpreting textual data.

Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To support the long-term life-cycle and the quality assurance of machine knowledge, the article presents methods for constructing open schemas and for knowledge curation. Case studies on academic projects and industrial knowledge graphs complement the survey of concepts and methods.
