AI, DB, LG · Nov 23, 2025

KGpipe: Generation and Evaluation of Pipelines for Data Integration into Knowledge Graphs

arXiv:2511.18364v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the lack of support for combining diverse data integration methods into effective pipelines, a gap that affects researchers and practitioners in knowledge graph construction. The contribution is incremental, however, since it builds on existing tools.

The authors address the problem of building high-quality knowledge graphs from diverse sources by introducing KGpipe, a framework for defining and executing reproducible end-to-end integration pipelines that combine existing tools or LLM functionality. They demonstrate its flexibility by evaluating several pipelines with performance and quality metrics.

Building high-quality knowledge graphs (KGs) from diverse sources requires combining methods for information extraction, data transformation, ontology mapping, entity matching, and data fusion. Numerous methods and tools exist for each of these tasks, but support for combining them into reproducible and effective end-to-end pipelines is still lacking. We present a new framework, KGpipe, for defining and executing integration pipelines that can combine existing tools or LLM (Large Language Model) functionality. To evaluate different pipelines and the resulting KGs, we propose a benchmark to integrate heterogeneous data of different formats (RDF, JSON, text) into a seed KG. We demonstrate the flexibility of KGpipe by running and comparatively evaluating several pipelines integrating sources of the same or different formats using selected performance and quality metrics.
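To make the idea of a composable integration pipeline concrete, here is a minimal sketch of how ontology mapping, entity matching, and fusion into a seed KG might be chained. It does not use or reproduce KGpipe's actual API: `Stage`, `map_ontology`, `match_entities`, `fuse_into`, and `run_pipeline` are hypothetical names, the KG is simplified to a set of string triples, and the lookup tables stand in for what real tools or an LLM would produce.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

# Simplifying assumption: a triple is (subject, predicate, object)
# and a KG is a plain set of such triples.
Triple = tuple[str, str, str]
KG = set[Triple]


@dataclass(frozen=True)
class Stage:
    """One named pipeline step (hypothetical, not KGpipe's API)."""
    name: str
    run: Callable[[KG], KG]


def map_ontology(mapping: dict[str, str]) -> Stage:
    """Rename source predicates to the seed KG's vocabulary."""
    def run(kg: KG) -> KG:
        return {(s, mapping.get(p, p), o) for s, p, o in kg}
    return Stage("ontology_mapping", run)


def match_entities(aliases: dict[str, str]) -> Stage:
    """Rewrite source identifiers to their matched seed-KG identifiers."""
    def run(kg: KG) -> KG:
        return {(aliases.get(s, s), p, aliases.get(o, o)) for s, p, o in kg}
    return Stage("entity_matching", run)


def fuse_into(seed: KG) -> Stage:
    """Fuse by set union, so identical triples collapse into one."""
    def run(kg: KG) -> KG:
        return seed | kg
    return Stage("fusion", run)


def run_pipeline(stages: Iterable[Stage], source: Iterable[Triple]) -> KG:
    """Apply each stage in order to the source triples."""
    kg: KG = set(source)
    for stage in stages:
        kg = stage.run(kg)
    return kg


# Toy data, invented for illustration only.
seed = {("ex:Q1", "ex:name", "Ada Lovelace")}
source = {
    ("src:42", "label", "Ada Lovelace"),
    ("src:42", "born", "1815"),
}
pipeline = [
    map_ontology({"label": "ex:name", "born": "ex:birthYear"}),
    match_entities({"src:42": "ex:Q1"}),
    fuse_into(seed),
]
result = run_pipeline(pipeline, source)
```

Because every stage shares the same KG-in, KG-out signature, stages can be reordered or swapped for alternatives, which is what makes it possible to compare different pipelines on the same input.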

