gBuilder: A Scalable Knowledge Graph Construction System for Unstructured Corpus
This work addresses the need for scalable and flexible knowledge graph construction systems among researchers and practitioners who work with unstructured data, though its contribution appears incremental because it builds on existing information extraction models.
The authors tackled the problem of constructing knowledge graphs from unstructured text by developing gBuilder, a scalable system that organizes multiple information extraction models in a uniform platform, with experimental results confirming its high scalability on large-scale tasks.
We design a user-friendly and scalable knowledge graph construction (KGC) system for extracting structured knowledge from unstructured corpora. Unlike existing KGC systems, gBuilder provides a flexible, user-defined pipeline that can keep pace with the rapid development of information extraction (IE) models. Built-in template-based and heuristic operators, together with programmable operators, are available for adapting to data from different domains. Furthermore, we design a cloud-based, self-adaptive task scheduling mechanism for gBuilder to ensure its scalability on large-scale knowledge graph construction. Experimental evaluation demonstrates that gBuilder can organize multiple information extraction models for knowledge graph construction on a uniform platform, and confirms its high scalability on large-scale KGC tasks.
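To make the idea of a user-defined pipeline of operators more concrete, the following is a minimal sketch in Python. It is not gBuilder's API: the names `Triple`, `split_sentences`, `template_extractor`, `normalize`, and `run_pipeline` are all hypothetical, and a regular expression stands in for a template-based operator. A learned IE model could presumably occupy the same slot, since each operator is just a function from a list of records to a list of records.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Triple:
    """One knowledge graph edge: (head entity, relation, tail entity)."""
    head: str
    relation: str
    tail: str


# Hypothetical operator interface: a function from a list to a list.
Operator = Callable[[list], list]


def split_sentences(docs: List[str]) -> List[str]:
    """Heuristic operator: split documents after sentence-ending punctuation."""
    sentences = []
    for doc in docs:
        parts = re.split(r"(?<=[.!?])\s+", doc)
        sentences.extend(p.strip() for p in parts if p.strip())
    return sentences


def template_extractor(pattern: str, relation: str) -> Operator:
    """Template-based operator: the pattern must define 'head' and 'tail' groups."""
    regex = re.compile(pattern)

    def run(sentences: List[str]) -> List[Triple]:
        return [
            Triple(m.group("head"), relation, m.group("tail"))
            for s in sentences
            for m in regex.finditer(s)
        ]

    return run


def normalize(triples: List[Triple]) -> List[Triple]:
    """Programmable operator: user-supplied post-processing (here, lowercasing)."""
    return [Triple(t.head.lower(), t.relation, t.tail.lower()) for t in triples]


def run_pipeline(operators: List[Operator], data: list) -> list:
    """Apply each operator in order, feeding its output to the next one."""
    for op in operators:
        data = op(data)
    return data


# The pipeline itself is plain data, so operators can be swapped or reordered.
pipeline = [
    split_sentences,
    template_extractor(
        r"(?P<head>[A-Z]\w+) was founded by (?P<tail>[A-Z]\w+(?: [A-Z]\w+)*)",
        "founded_by",
    ),
    normalize,
]

docs = ["Acme was founded by Jane Doe. It makes anvils."]
triples = run_pipeline(pipeline, docs)
print(triples)
```

Keeping the pipeline as an ordered list of interchangeable operators is one simple way to let a template be replaced by a different extractor without touching the surrounding steps. Scheduling and distribution across cloud workers are deliberately left out of this sketch.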