On the complexity of finding set repairs for data-graphs
This work addresses the challenge of keeping data consistent in graph databases, which matters for applications that rely on interconnected data. The contribution is incremental in that it extends repair concepts from the relational setting to the graph setting.
The paper tackles the problem of computing subset and superset repairs for graph databases that violate integrity constraints defined by Reg-GXPath expressions, showing that these problems are polynomial-time solvable for positive fragments but intractable for the full language.
In the deeply interconnected world we live in, pieces of information link domains all around us. Because graph databases capture relationships among data effectively and allow these connections to be processed and queried efficiently, they are rapidly becoming a popular storage platform that supports a wide range of domains and applications. As in the relational case, the data is expected to satisfy a set of integrity constraints that define the semantic structure of the world it represents. When a database does not satisfy its integrity constraints, one possible approach is to search for a 'similar' database that does satisfy them, known as a repair. In this work, we study the problem of computing subset and superset repairs for graph databases with data values, using a notion of consistency based on a set of Reg-GXPath expressions as integrity constraints. We show that for positive fragments of Reg-GXPath these problems admit a polynomial-time algorithm, while the full expressive power of the language renders them intractable.
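To make the notion of a subset repair concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes a toy data-graph given only as labelled edges (data values are ignored), and it stands in for a Reg-GXPath constraint with a hypothetical Python predicate `consistent`. The function name `subset_repairs` and the `manages` label are likewise invented for illustration. The search is brute force and exponential in the number of edges, so it shows only what a repair is, not how to compute one efficiently.

```python
from itertools import combinations

# Toy data-graph: a set of labelled edges (source, label, target).
edges = frozenset({
    ("a", "manages", "b"),
    ("b", "manages", "a"),
    ("b", "manages", "c"),
})

def consistent(edge_set):
    """Hypothetical constraint: no edge may have its reverse in the graph."""
    return not any((t, lbl, s) in edge_set for (s, lbl, t) in edge_set)

def subset_repairs(edge_set, is_consistent):
    """Return every inclusion-maximal consistent subset of edge_set."""
    repairs = []
    # Visit candidates from largest to smallest, so that any consistent
    # candidate not contained in an earlier repair is itself maximal.
    for size in range(len(edge_set), -1, -1):
        for candidate in map(frozenset, combinations(sorted(edge_set), size)):
            if is_consistent(candidate) and not any(candidate < r for r in repairs):
                repairs.append(candidate)
    return repairs

repairs = subset_repairs(edges, consistent)
for repair in repairs:
    print(sorted(repair))
```

In this toy instance the two `manages` edges between `a` and `b` conflict, so each repair keeps exactly one of them together with the edge from `b` to `c`. A superset repair would instead add edges until the constraints hold, which this sketch does not cover.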