DB · AI · May 19, 2021

Stratified Data Integration

arXiv:2105.09432v1 · 17 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses data integration for users who face semantic heterogeneity, but the contribution appears incremental because it builds on existing layered-representation ideas.

The authors tackled semantic heterogeneity in data integration by proposing a stratified representation approach with conceptual, language, knowledge, and data layers, enabling uniform handling of different heterogeneity types; the framework was evaluated in pilot case studies and industrial problems.

We propose a novel approach to the problem of semantic heterogeneity where data are organized into a set of stratified and independent representation layers, namely: conceptual (where a set of unique alinguistic identifiers are connected inside a graph codifying their meaning), language (where sets of synonyms, possibly from multiple languages, annotate concepts), knowledge (in the form of a graph where nodes are entity types and links are properties), and data (in the form of a graph of entities populating the previous knowledge graph). This allows us to state the problem of semantic heterogeneity as a problem of Representation Diversity where the different types of heterogeneity, viz. Conceptual, Language, Knowledge, and Data, are uniformly dealt with within each single layer, independently of the others. In this paper we describe the proposed stratified representation of data and the process by which data are first transformed into the target representation, then suitably integrated, and finally presented to the user in her preferred format. The proposed framework has been evaluated in various pilot case studies and in a number of industrial data integration problems.
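To make the four layers concrete, here is a minimal Python sketch of how such a stratified store might look. It is an illustration under assumptions, not the authors' implementation: the class `StratifiedStore`, its methods (`concept_of`, `word_for`, `ingest`, `present`), the integer concept identifiers, and the sample records are all hypothetical, and the conceptual graph is simplified to a flat parent map.

```python
from dataclasses import dataclass, field


@dataclass
class StratifiedStore:
    """Toy four-layer store; every name here is hypothetical."""
    # Conceptual layer: alinguistic concept ID -> parent concept ID (or None).
    concepts: dict = field(default_factory=dict)
    # Language layer: (language, lowercase word) -> concept ID.
    lexicon: dict = field(default_factory=dict)
    # Knowledge layer: entity-type concept ID -> set of property concept IDs.
    etypes: dict = field(default_factory=dict)
    # Data layer: entity ID -> (entity-type concept ID, {property concept ID: value}).
    entities: dict = field(default_factory=dict)

    def concept_of(self, lang, word):
        """Resolve a word in a given language to its concept ID."""
        return self.lexicon[(lang, word.lower())]

    def word_for(self, lang, concept):
        """Return the first registered word for a concept in a language."""
        for (entry_lang, word), entry_concept in self.lexicon.items():
            if entry_lang == lang and entry_concept == concept:
                return word
        raise KeyError((lang, concept))

    def ingest(self, entity_id, lang, etype_word, record):
        """Transform a source record into the language-independent representation."""
        etype = self.concept_of(lang, etype_word)
        values = {}
        for label, value in record.items():
            prop = self.concept_of(lang, label)
            if prop not in self.etypes[etype]:
                raise ValueError(f"property {label!r} not defined for {etype_word!r}")
            values[prop] = value
        self.entities[entity_id] = (etype, values)

    def present(self, entity_id, lang):
        """Render an integrated entity using labels from the user's language."""
        _etype, values = self.entities[entity_id]
        return {self.word_for(lang, prop): value for prop, value in values.items()}


store = StratifiedStore()
# Assumed concept IDs: 1 = person, 2 = name, 3 = birthplace (no hierarchy).
store.concepts = {1: None, 2: None, 3: None}
store.lexicon = {
    ("en", "person"): 1, ("it", "persona"): 1,
    ("en", "name"): 2, ("it", "nome"): 2,
    ("en", "birthplace"): 3, ("it", "luogo di nascita"): 3,
}
store.etypes = {1: {2, 3}}

# Two sources in different languages land on the same concepts.
store.ingest("e1", "en", "Person", {"Name": "Ada", "Birthplace": "London"})
store.ingest("e2", "it", "Persona", {"Nome": "Enrico", "Luogo di nascita": "Roma"})
```

In this sketch, language heterogeneity is resolved entirely in `lexicon`: once both records are ingested they share the same entity-type and property concept IDs, so `store.present("e2", "en")` can render the Italian-sourced entity with English labels without touching the knowledge or data layers.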

