AIDB
May 29, 2020

KGTK: A Toolkit for Large Knowledge Graph Manipulation and Analysis

arXiv:2006.00088v3
65 citations
Originality: Synthesis-oriented
AI Analysis

KGTK helps developers and data scientists manipulate and analyze large knowledge graphs efficiently. The contribution is incremental, since it builds on existing data science libraries.

The authors address the heterogeneous, difficult-to-integrate tooling for large knowledge graph (KG) operations by developing KGTK, a data science-centric toolkit. KGTK represents graphs in tables and leverages popular data science libraries, which makes it easier to construct KG pipelines for applications such as integrating Wikidata, DBpedia, and ConceptNet.

Knowledge graphs (KGs) have become the preferred technology for representing, sharing and adding knowledge to modern AI applications. While KGs have become a mainstream technology, the RDF/SPARQL-centric toolset for operating with them at scale is heterogeneous, difficult to integrate and only covers a subset of the operations that are commonly needed in data science applications. In this paper we present KGTK, a data science-centric toolkit designed to represent, create, transform, enhance and analyze KGs. KGTK represents graphs in tables and leverages popular libraries developed for data science applications, enabling a wide audience of developers to easily construct knowledge graph pipelines for their applications. We illustrate the framework with real-world scenarios where we have used KGTK to integrate and manipulate large KGs, such as Wikidata, DBpedia and ConceptNet.
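To make the "graphs in tables" idea concrete, here is a minimal sketch using only the Python standard library. It is not KGTK's API: the column names (`node1`, `label`, `node2`), the toy data, and the helper functions `load_edges`, `filter_edges`, and `out_degree` are all assumptions chosen for illustration. The point is only that once every edge is a table row, common graph operations reduce to ordinary table operations such as filtering and grouping.

```python
import csv
import io
from collections import Counter

# Hypothetical edge table: one tab-separated row per edge.
# The column names and the toy data are illustrative assumptions.
EDGES_TSV = (
    "node1\tlabel\tnode2\n"
    "alice\tknows\tbob\n"
    "alice\tworks_at\tacme\n"
    "bob\tworks_at\tacme\n"
    "acme\tlocated_in\tparis\n"
)


def load_edges(text):
    """Parse a TSV edge table into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def filter_edges(edges, label):
    """Keep only edges with the given label (a row-wise table filter)."""
    return [e for e in edges if e["label"] == label]


def out_degree(edges):
    """Count outgoing edges per source node (a group-by over node1)."""
    return Counter(e["node1"] for e in edges)


edges = load_edges(EDGES_TSV)
works_at = filter_edges(edges, "works_at")
degrees = out_degree(edges)
```

Because the graph is just rows, the same data could be loaded into any tabular library and combined with joins, sorts, or aggregations, which is the kind of pipeline construction the abstract describes.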

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
