AI · Nov 25, 2025

Schema Matching on Graph: Iterative Graph Exploration for Efficient and Explainable Data Integration

arXiv:2511.20285v2 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses efficient and reliable data integration for medical professionals and researchers. It is an incremental advance because it builds on existing KG-augmented LLM approaches.

The paper tackles schema matching in data integration, particularly the alignment of Electronic Health Record (EHR) systems to standard models. It introduces SMoG, a framework that runs iterative 1-hop SPARQL queries on knowledge graphs to reduce storage needs and improve explainability. On real-world medical datasets, SMoG achieves performance comparable to state-of-the-art baselines.

Schema matching is a critical task in data integration, particularly in the medical domain where disparate Electronic Health Record (EHR) systems must be aligned to standard models like OMOP CDM. While Large Language Models (LLMs) have shown promise in schema matching, they suffer from hallucination and lack of up-to-date domain knowledge. Knowledge Graphs (KGs) offer a solution by providing structured, verifiable knowledge. However, existing KG-augmented LLM approaches often rely on inefficient complex multi-hop queries or storage-intensive vector-based retrieval methods. This paper introduces SMoG (Schema Matching on Graph), a novel framework that leverages iterative execution of simple 1-hop SPARQL queries, inspired by successful strategies in Knowledge Graph Question Answering (KGQA). SMoG enhances explainability and reliability by generating human-verifiable query paths while significantly reducing storage requirements by directly querying SPARQL endpoints. Experimental results on real-world medical datasets demonstrate that SMoG achieves performance comparable to state-of-the-art baselines, validating its effectiveness and efficiency in KG-augmented schema matching.
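The idea of iterating simple 1-hop SPARQL queries can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triples, the `ex:` and `omop:` identifiers, the in-memory `toy_endpoint` stand-in for a real SPARQL endpoint, the keyword-based `score` function, and the beam-style pruning are all assumptions made for illustration. In a KG-augmented LLM setting the pruning step would presumably be delegated to the model rather than to a keyword test.

```python
from typing import Callable, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy knowledge graph; none of these triples come from the paper.
TOY_KG: List[Triple] = [
    ("ex:pt_birth_dt", "ex:describes", "ex:DateOfBirth"),
    ("ex:pt_birth_dt", "ex:storedIn", "ex:AdmissionsTable"),
    ("ex:DateOfBirth", "ex:mapsTo", "omop:person.birth_datetime"),
    ("ex:AdmissionsTable", "ex:mapsTo", "omop:visit_occurrence"),
]


def one_hop_query(entity: str) -> str:
    """Build a 1-hop SPARQL query for the outgoing edges of `entity`.

    PREFIX declarations are omitted for brevity.
    """
    return f"SELECT ?p ?o WHERE {{ {entity} ?p ?o }}"


def toy_endpoint(entity: str) -> List[Tuple[str, str]]:
    """Stand-in for a SPARQL endpoint: answers the 1-hop query from TOY_KG."""
    return [(p, o) for s, p, o in TOY_KG if s == entity]


def explore(
    start: str,
    is_target: Callable[[str], bool],
    score: Callable[[str], float],
    fetch: Callable[[str], List[Tuple[str, str]]] = toy_endpoint,
    beam_width: int = 1,
    max_hops: int = 3,
) -> Tuple[Optional[List[str]], List[str]]:
    """Expand the graph one hop at a time, keeping the best-scoring paths.

    Returns the first path that reaches a target node (or None) together
    with the list of queries issued, which serves as a human-readable trace.
    """
    frontier: List[List[str]] = [[start]]
    issued: List[str] = []
    for _ in range(max_hops):
        candidates: List[List[str]] = []
        for path in frontier:
            node = path[-1]
            issued.append(one_hop_query(node))
            for predicate, obj in fetch(node):
                new_path = path + [predicate, obj]
                if is_target(obj):
                    return new_path, issued
                candidates.append(new_path)
        if not candidates:
            break
        # Prune: keep only the most promising paths for the next hop.
        candidates.sort(key=lambda p: score(p[-1]), reverse=True)
        frontier = candidates[:beam_width]
    return None, issued


path, queries = explore(
    start="ex:pt_birth_dt",
    is_target=lambda node: node.startswith("omop:"),
    score=lambda node: 1.0 if "birth" in node.lower() else 0.0,
)
print(" -> ".join(path))
for q in queries:
    print(q)
```

Here the returned path doubles as the explanation: each step corresponds to one simple query that a reviewer could rerun against the endpoint. Because only the neighbours of the current frontier are fetched, nothing beyond the explored paths needs to be stored locally.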
