Leopoldo Bertossi

h-index: 8

39 papers

513 citations

Novelty: 34%

AI Score: 45

Ranked #69,664 of 205,806 authors (top 34%); #141 in DB (top 27%)

39 Papers

AI · Mar 6, 2023

Attribution-Scores and Causal Counterfactuals as Explanations in Artificial Intelligence

Leopoldo Bertossi

In this expository article we highlight the relevance of explanations for artificial intelligence, in general, and for the newer developments in explainable AI, referring to the origins of, and connections among, different approaches. We describe, in simple terms, explanations in data management and machine learning that are based on attribution-scores, and counterfactuals as found in the area of causality. We elaborate on the importance of logical reasoning when dealing with counterfactuals, and their use for score computation.

DB · Jun 15, 2023

From Database Repairs to Causality in Databases and Beyond

Leopoldo Bertossi

We describe some recent approaches to score-based explanations for query answers in databases. The focus is on work done by the author and collaborators. Special emphasis is placed on the use of counterfactual reasoning for score specification and computation. Several examples that illustrate the flexibility of these methods are shown.

AI · Mar 11, 2023

Efficient Computation of Shap Explanation Scores for Neural Network Classifiers via Knowledge Compilation

Leopoldo Bertossi, Jorge E. Leon

The use of Shap scores has become widespread in Explainable AI. However, their computation is in general intractable, in particular when done with a black-box classifier, such as a neural network. Recent research has unveiled classes of open-box Boolean Circuit classifiers for which Shap can be computed efficiently. We show how to transform binary neural networks into those circuits for efficient Shap computation. We use logic-based knowledge compilation techniques. The performance gain is huge, as we show in the light of our experiments.

14.8 · DB · Apr 7

Database Querying under Missing Values Governed by Missingness Mechanisms

Leopoldo Bertossi, Farouk Toumani, Maxime Buron

We address the problems of giving a semantics to, and doing query answering (QA) on, a relational database (RDB) that has missing values (MVs). The causes for the latter are governed by a Missingness Mechanism that is modelled as a Bayesian Network, which represents a Missingness Graph (MG) and involves the DB attributes. Our approach considerably departs from the treatment of RDBs with NULL (values). The MG together with the observed DB allow us to build a block-independent probabilistic DB, on which basis we propose two QA techniques that jointly capture probabilistic uncertainty and statistical plausibility of the implicit imputation of MVs. We obtain complexity results that characterize the computational feasibility of those approaches.

DB · Jul 31, 2023

Attribution-Scores in Data Management and Explainable Machine Learning

Leopoldo Bertossi

We describe recent research on the use of actual causality in the definition of responsibility scores as explanations for query answers in databases, and for outcomes from classification models in machine learning. In the case of databases, useful connections with database repairs are illustrated and exploited. Repairs are also used to give a quantitative measure of the consistency of a database. For classification models, the responsibility score is properly extended and illustrated. The efficient computation of Shap-score is also analyzed and discussed. The emphasis is placed on work done by the author and collaborators.

LO · Sep 25, 2022

Answer-Set Programs for Repair Updates and Counterfactual Interventions

Leopoldo Bertossi

We briefly describe, mainly through very simple examples, different kinds of answer-set programs with annotations that have been proposed for specifying: database repairs and consistent query answering; secrecy views and query evaluation with them; counterfactual interventions for causality in databases; and counterfactual-based explanations in machine learning.

6.4 · AI · May 9

Reconciling Consistency-Based Diagnosis with Actual-Causality-Based Explanations

Leopoldo Bertossi

We establish, from the point of view of Explainable AI (XAI), connections between Consistency-Based Diagnosis (CBD), on one side, and Actual Causality and Causal Responsibility, on the other. CBD has received little attention from the XAI community. Connections between these two areas could have a fruitful impact on XAI and Explainable Data Management.

DB · Feb 4, 2025

The Causal-Effect Score in Data Management

Felipe Azua, Leopoldo Bertossi

The Causal Effect (CE) is a numerical measure of causal influence of variables on observed results. Despite being widely used in many areas, only preliminary attempts have been made to use CE as an attribution score in data management, to measure the causal strength of tuples for query answering in databases. In this work, we introduce, generalize and investigate the so-called Causal-Effect Score in the context of classical and probabilistic databases.

DB · Mar 18, 2025

Causality-Based Scores Alignment in Explainable Data Management

Felipe Azua, Leopoldo Bertossi

Different attribution scores have been proposed to quantify the relevance of database tuples for query answering in databases; e.g. Causal Responsibility, the Shapley Value, the Banzhaf Power-Index, and the Causal Effect. They have been analyzed in isolation. This work is a first investigation of score alignment depending on the query and the database; i.e. on whether they induce compatible rankings of tuples. We concentrate mostly on causality-based scores; and provide a syntactic dichotomy result for queries: on one side, pairs of scores are always aligned, on the other, they are not always aligned. It turns out that the presence of exogenous tuples makes a crucial difference in this regard.

AI · Jan 23, 2024

The Distributional Uncertainty of the SHAP score in Explainable Machine Learning

Santiago Cifuentes, Leopoldo Bertossi, Nina Pardal et al.

Attribution scores reflect how important the feature values in an input entity are for the output of a machine learning model. One of the most popular attribution scores is the SHAP score, which is an instantiation of the general Shapley value used in coalition game theory. The definition of this score relies on a probability distribution on the entity population. Since the exact distribution is generally unknown, it needs to be assigned subjectively or be estimated from data, which may lead to misleading feature scores. In this paper, we propose a principled framework for reasoning on SHAP scores under unknown entity population distributions. In our framework, we consider an uncertainty region that contains the potential distributions, and the SHAP score of a feature becomes a function defined over this region. We study the basic problems of finding maxima and minima of this function, which allows us to determine tight ranges for the SHAP scores of all features. In particular, we pinpoint the complexity of these problems, and other related ones, showing them to be NP-complete. Finally, we present experiments on a real-world dataset, showing that our framework may contribute to a more robust feature scoring.
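
The dependence of the SHAP score on the population distribution can be illustrated with a brute-force sketch. It is not the method of the paper: it assumes binary features, a product distribution given by hypothetical per-feature probabilities `probs`, and a toy classifier `model`, all of which are introduced here only for illustration. It enumerates every feature subset, so it is only usable for very small inputs.

```python
from itertools import combinations, product
from math import factorial


def expected_output(model, entity, fixed, probs):
    """E[model(e')] when the features in `fixed` keep the values of `entity`
    and the rest are drawn independently with P(feature i = 1) = probs[i]."""
    n = len(entity)
    free = [i for i in range(n) if i not in fixed]
    total = 0.0
    for values in product([0, 1], repeat=len(free)):
        candidate = list(entity)
        weight = 1.0
        for i, v in zip(free, values):
            candidate[i] = v
            weight *= probs[i] if v == 1 else 1.0 - probs[i]
        total += weight * model(tuple(candidate))
    return total


def shap_scores(model, entity, probs):
    """Shapley value of each feature for the game S -> expected_output(S)."""
    n = len(entity)
    scores = []
    for x in range(n):
        others = [i for i in range(n) if i != x]
        score = 0.0
        for k in range(n):
            coeff = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                gain = (expected_output(model, entity, s | {x}, probs)
                        - expected_output(model, entity, s, probs))
                score += coeff * gain
        scores.append(score)
    return scores


def model(e):
    # Hypothetical toy classifier: x0 AND (x1 OR x2).
    return int(e[0] and (e[1] or e[2]))


entity = (1, 1, 0)
uniform = shap_scores(model, entity, [0.5, 0.5, 0.5])
skewed = shap_scores(model, entity, [0.9, 0.5, 0.5])
print(uniform, skewed)
```

Running the same entity under the two distributions gives different score vectors, which is the kind of sensitivity that motivates treating the score as a function over an uncertainty region of distributions.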

DB · Nov 19, 2025

Sufficient Explanations in Databases and their Connections to Necessary Explanations and Repairs

Leopoldo Bertossi, Nina Pardal

The notion of cause, as formalized by Halpern and Pearl, has been recently applied to relational databases, to characterize and compute causal explanations for query answers. In this work we consider the alternative notion of sufficient explanation. We investigate its connections with database repairs as used for dealing with inconsistent databases, and with causality-based necessary explanations. We also obtain some computational results.

AI · Aug 25, 2021

Reasoning about Counterfactuals and Explanations: Problems, Results and Directions

Leopoldo Bertossi

There are some recent approaches and results about the use of answer-set programming for specifying counterfactual interventions on entities under classification, and reasoning about them. These approaches are flexible and modular in that they allow the seamless addition of domain knowledge. Reasoning is enabled by query answering from the answer-set program. The programs can be used to specify and compute responsibility-based numerical scores as attributive explanations for classification results.

DB · Aug 19, 2021

Second-Order Specifications and Quantifier Elimination for Consistent Query Answering in Databases

Leopoldo Bertossi

Consistent answers to a query from a possibly inconsistent database are answers that are simultaneously retrieved from every possible repair of the database. Repairs are consistent instances that minimally differ from the original inconsistent instance. It has been shown before that database repairs can be specified as the stable models of a disjunctive logic program. In this paper we show how to use the repair programs to transform the problem of consistent query answering into a problem of reasoning w.r.t. a theory written in second-order predicate logic. It is also investigated how a first-order theory can be obtained instead by applying second-order quantifier elimination techniques.
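
The definition of consistent answers in the first two sentences can be made concrete with a small brute-force sketch. It does not use repair programs or second-order logic as the paper does. It assumes a single binary relation under a key constraint (the first attribute determines the second), takes repairs to be the maximal consistent subsets under set inclusion, and uses hypothetical names (`is_consistent`, `repairs`, `consistent_answers`) and toy data.

```python
from itertools import combinations


def is_consistent(instance):
    """Key constraint (assumed): the first attribute determines the second."""
    seen = {}
    for key, value in instance:
        if seen.setdefault(key, value) != value:
            return False
    return True


def repairs(db):
    """Maximal (under set inclusion) consistent subsets of db."""
    tuples = list(db)
    consistent = [frozenset(s)
                  for k in range(len(tuples) + 1)
                  for s in combinations(tuples, k)
                  if is_consistent(s)]
    return [s for s in consistent if not any(s < t for t in consistent)]


def consistent_answers(db, query):
    """Answers returned by `query` (a function from instance to set) in every repair."""
    return set.intersection(*(set(query(r)) for r in repairs(db)))


# Toy employee table (name, salary); "page" violates the key constraint.
db = {("page", 5000), ("page", 8000), ("smith", 3000)}

full_tuples = consistent_answers(db, lambda r: set(r))
names = consistent_answers(db, lambda r: {name for name, _ in r})
print(full_tuples, names)
```

In this toy instance the two repairs disagree on the salary of "page", so no full tuple for "page" is a consistent answer, while the name "page" still is, since it appears in every repair.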

DB · Aug 2, 2021

Extending Sticky-Datalog+/- via Finite-Position Selection Functions: Tractability, Algorithms, and Optimization

Leopoldo Bertossi, Mostafa Milani

Weakly-Sticky (WS) Datalog+/- is an expressive member of the family of Datalog+/- program classes that is defined on the basis of the conditions of stickiness and weak-acyclicity. Conjunctive query answering (QA) over the WS programs has been investigated, and its tractability in data complexity has been established. However, the design and implementation of practical QA algorithms and their optimizations have been open. In order to fill this gap, we first study Sticky and WS programs from the point of view of the behavior of the chase procedure. We extend the stickiness property of the chase to that of generalized stickiness of the chase (GSCh) modulo an oracle that selects (and provides) the predicate positions where finitely many values appear during the chase. Stickiness modulo a selection function S that provides only a subset of those positions defines sch(S), a semantic subclass of GSCh. Program classes with selection functions include Sticky and WS, and another syntactic class that we introduce and characterize, namely JWS, of jointly-weakly-sticky programs, which contains WS. The selection functions for these last three classes are computable, and no external, possibly non-computable oracle is needed. We propose a bottom-up QA algorithm for programs in the class sch(S), for a general selection function S. As a particular case, we obtain a polynomial-time QA algorithm for JWS and weakly-sticky programs. Unlike WS, JWS turns out to be closed under magic-sets query optimization. As a consequence, both the generic polynomial-time QA algorithm and its magic-set optimization can be particularized and applied to WS.

AI · Jul 21, 2021

Answer-Set Programs for Reasoning about Counterfactual Interventions and Responsibility Scores for Classification

Leopoldo Bertossi, Gabriela Reyes

We describe how answer-set programs can be used to declaratively specify counterfactual interventions on entities under classification, and reason about them. In particular, they can be used to define and compute responsibility scores as attribution-based explanations for outcomes from classification models. The approach allows for the inclusion of domain knowledge and supports query answering. A detailed example with a naive-Bayes classifier is presented.

AI · Jun 19, 2021

Score-Based Explanations in Data Management and Machine Learning: An Answer-Set Programming Approach to Counterfactual Analysis

Leopoldo Bertossi

We describe some recent approaches to score-based explanations for query answers in databases and outcomes from classification models in machine learning. The focus is on work done by the author and collaborators. Special emphasis is placed on declarative approaches based on answer-set programming to the use of counterfactual reasoning for score specification and computation. Several examples that illustrate the flexibility of these methods are shown.

AI · Apr 16, 2021

On the Complexity of SHAP-Score-Based Explanations: Tractability via Knowledge Compilation and Non-Approximability Results

Marcelo Arenas, Pablo Barceló, Leopoldo Bertossi et al.

In Machine Learning, the $\mathsf{SHAP}$-score is a version of the Shapley value that is used to explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is an intractable problem, we prove a strong positive result stating that the $\mathsf{SHAP}$-score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits are studied in the field of Knowledge Compilation and generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees and Ordered Binary Decision Diagrams (OBDDs). We also establish the computational limits of the $\mathsf{SHAP}$-score by observing that computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider. It also implies that computing $\mathsf{SHAP}$-scores is intractable as well over the class of propositional formulas in DNF. Based on this negative result, we look for the existence of fully-polynomial randomized approximation schemes (FPRAS) for computing $\mathsf{SHAP}$-scores over such a class. In contrast to the model counting problem for DNF formulas, which admits an FPRAS, we prove that no such FPRAS exists for the computation of $\mathsf{SHAP}$-scores. Surprisingly, this negative result holds even for the class of monotone formulas in DNF. These techniques can be further extended to prove another strong negative result: Under widely believed complexity assumptions, there is no polynomial-time algorithm that checks, given a monotone DNF formula $\varphi$ and features $x,y$, whether the $\mathsf{SHAP}$-score of $x$ in $\varphi$ is smaller than the $\mathsf{SHAP}$-score of $y$ in $\varphi$.

AI · Nov 15, 2020

Declarative Approaches to Counterfactual Explanations for Classification

Leopoldo Bertossi

We propose answer-set programs that specify and compute counterfactual interventions on entities that are input on a classification model. In relation to the outcome of the model, the resulting counterfactual entities serve as a basis for the definition and computation of causality-based explanation scores for the feature values in the entity under classification, namely "responsibility scores". The approach and the programs can be applied with black-box models, and also with models that can be specified as logic programs, such as rule-based classifiers. The main focus of this work is on the specification and computation of "best" counterfactual entities, i.e. those that lead to maximum responsibility scores. From them one can read off the explanations as maximum responsibility feature values in the original entity. We also extend the programs to bring into the picture semantic or domain knowledge. We show how the approach could be extended by means of probabilistic methods, and how the underlying probability distributions could be modified through the use of constraints. Several examples of programs written in the syntax of the DLV ASP-solver, and run with it, are shown.

DB · Jul 24, 2020

Score-Based Explanations in Data Management and Machine Learning

Leopoldo Bertossi

We describe some approaches to explanations for observed outcomes in data management and machine learning. They are based on the assignment of numerical scores to predefined and potentially relevant inputs. More specifically, we consider explanations for query answers in databases, and for results from classification models. The described approaches are mostly of a causal and counterfactual nature. We argue for the need to bring domain and semantic knowledge into score computations; and suggest some ways to do this.

LG · Apr 28, 2020

An ASP-Based Approach to Counterfactual Explanations for Classification

Leopoldo Bertossi

We propose answer-set programs that specify and compute counterfactual interventions as a basis for causality-based explanations to decisions produced by classification models. They can be applied with black-box models and models that can be specified as logic programs, such as rule-based classifiers. The main focus is on the specification and computation of maximum responsibility causal explanations. The use of additional semantic knowledge is investigated.

LG · Mar 15, 2020

Causality-based Explanation of Classification Outcomes

Leopoldo Bertossi, Jordan Li, Maximilian Schleich et al.

We propose a simple definition of an explanation for the outcome of a classifier based on concepts from causality. We compare it with previously proposed notions of explanation, and study their complexity. We conduct an experimental evaluation with two real datasets from the financial domain.

DB · Sep 27, 2018

Repair-Based Degrees of Database Inconsistency: Computation and Complexity

Leopoldo Bertossi

We propose a generic numerical measure of the inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. In particular, an inconsistency measure associated to cardinality-repairs is investigated in detail. More specifically, it is shown that it can be computed via answer-set programs, but sometimes its computation can be intractable in data complexity. However, polynomial-time deterministic and randomized approximations are exhibited. The behavior of this measure under small updates is analyzed, obtaining fixed-parameter tractability results. Furthermore, alternative inconsistency measures are proposed and discussed.
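
A repair-based inconsistency measure of this kind can be sketched by brute force. The sketch makes several assumptions that the abstract does not state: the only integrity constraint is a key constraint on a binary relation, a cardinality-repair is a consistent subset of maximum size, and the degree is normalized as the fraction of tuples that must be deleted to reach such a repair. The function names and the toy data are hypothetical, and the exhaustive search is exponential, in line with the remark that exact computation can be intractable.

```python
from itertools import combinations


def violates_key(instance):
    """Assumed constraint: the first attribute is a key for the relation."""
    seen = {}
    for key, value in instance:
        if seen.setdefault(key, value) != value:
            return True
    return False


def inconsistency_degree(db):
    """(|D| - size of a largest consistent subset of D) / |D|."""
    tuples = list(db)
    if not tuples:
        return 0.0
    # Try subset sizes from largest to smallest; the empty set is always consistent.
    for size in range(len(tuples), -1, -1):
        if any(not violates_key(s) for s in combinations(tuples, size)):
            return (len(tuples) - size) / len(tuples)


dirty = {("page", 5000), ("page", 8000), ("smith", 3000), ("stowe", 7000)}
clean = {("page", 5000), ("smith", 3000)}
print(inconsistency_degree(dirty), inconsistency_degree(clean))
```

Under these assumptions a consistent instance has degree 0, and the degree grows with the minimum number of deletions needed to restore consistency.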

DB · Apr 24, 2018

Measuring and Computing Database Inconsistency via Repairs

Leopoldo Bertossi

We propose a generic numerical measure of inconsistency of a database with respect to a set of integrity constraints. It is based on an abstract repair semantics. A particular inconsistency measure associated to cardinality-repairs is investigated; and we show that it can be computed via answer-set programs. Keywords: Integrity constraints in databases, inconsistent databases, database repairs, inconsistency measure.

DB · Mar 17, 2018

Datalog: Bag Semantics via Set Semantics

Leopoldo Bertossi, Georg Gottlob, Reinhard Pichler

Duplicates in data management are common and problematic. In this work, we present a translation of Datalog under bag semantics into a well-behaved extension of Datalog, the so-called warded Datalog$^\pm$, under set semantics. From a theoretical point of view, this allows us to reason on bag semantics by making use of the well-established theoretical foundations of set semantics. From a practical point of view, this allows us to handle the bag semantics of Datalog by powerful, existing query engines for the required extension of Datalog. This use of Datalog$^\pm$ is extended to give a set semantics to duplicates in Datalog$^\pm$ itself. We investigate the properties of the resulting Datalog$^\pm$ programs, the problem of deciding multiplicities, and expressibility of some bag operations. Moreover, the proposed translation has the potential for interesting applications such as to Multiset Relational Algebra and the semantic web query language SPARQL with bag semantics.

DB · Dec 4, 2017

Specifying and Computing Causes for Query Answers in Databases via Database Repairs and Repair Programs

Leopoldo Bertossi

A correspondence between database tuples as causes for query answers in databases and tuple-based repairs of inconsistent databases with respect to denial constraints has already been established. In this work, answer-set programs that specify repairs of databases are used as a basis for solving computational and reasoning problems about causes. Here, causes are also introduced at the attribute level by appealing to a repair semantics that is both null-based and attribute-based. The corresponding repair programs are presented, and they are used as a basis for computation and reasoning about attribute-level causes. They are extended to deal with the case of causality under integrity constraints.

DB · Apr 17, 2017

The Causality/Repair Connection in Databases: Causality-Programs

Leopoldo Bertossi

In this work, answer-set programs that specify repairs of databases are used as a basis for solving computational and reasoning problems about causes for query answers from databases.

DB · Apr 1, 2017

Ontological Multidimensional Data Models and Contextual Data Quality

Leopoldo Bertossi, Mostafa Milani

Data quality assessment and data cleaning are context-dependent activities. Motivated by this observation, we propose the Ontological Multidimensional Data Model (OMD model), which can be used to model and represent contexts as logic-based ontologies. The data under assessment is mapped into the context, for additional analysis, processing, and quality data extraction. The resulting contexts allow for the representation of dimensions, and multidimensional data quality assessment becomes possible. At the core of a multidimensional context we include a generalized multidimensional data model and a Datalog+/- ontology with provably good properties in terms of query answering. These main components are used to represent dimension hierarchies, dimensional constraints, dimensional rules, and define predicates for quality data specification. Query answering relies upon and triggers navigation through dimension hierarchies, and becomes the basic tool for the extraction of quality data. The OMD model is interesting per se, beyond applications to data quality. It allows for a logic-based, and computationally tractable representation of multidimensional data, extending previous multidimensional data models with additional expressive power and functionalities.

DB · Mar 10, 2017

The Ontological Multidimensional Data Model

Leopoldo Bertossi, Mostafa Milani

In this extended abstract we describe, mainly by examples, the main elements of the Ontological Multidimensional Data Model, which considerably extends a relational reconstruction of the multidimensional data model proposed by Hurtado and Mendelzon by means of tuple-generating dependencies, equality-generating dependencies, and negative constraints as found in Datalog+-. We briefly mention some good computational properties of the model.

DB · Nov 21, 2016

Enforcing Relational Matching Dependencies with Datalog for Entity Resolution

Zeinab Bahmani, Leopoldo Bertossi

Entity resolution (ER) is about identifying and merging records in a database that represent the same real-world entity. Matching dependencies (MDs) have been introduced and investigated as declarative rules that specify ER policies. An ER process induced by MDs over a dirty instance leads to multiple clean instances, in general. General "answer set programs" (ASPs) have been proposed to specify the MD-based cleaning task and its results. In this work, we extend MDs to "relational MDs", which capture more application semantics, and identify classes of relational MDs for which the general ASP can be automatically rewritten into a stratified Datalog program, with the single clean instance as its standard model.

DB · Nov 6, 2016

Causes for Query Answers from Databases: Datalog Abduction, View-Updates, and Integrity Constraints

Leopoldo Bertossi, Babak Salimi

Causality has been recently introduced in databases, to model, characterize, and possibly compute causes for query answers. Connections between QA-causality and consistency-based diagnosis and database repairs (wrt. integrity constraint violations) have already been established. In this work we establish precise connections between QA-causality and both abductive diagnosis and the view-update problem in databases, allowing us to obtain new algorithmic and complexity results for QA-causality. We also obtain new results on the complexity of view-conditioned causality, and investigate the notion of QA-causality in the presence of integrity constraints, obtaining complexity results from a connection with view-conditioned causality. The abduction connection under integrity constraints allows us to obtain algorithmic tools for QA-causality.

DB · Jul 10, 2016

Extending Weakly-Sticky Datalog+/-: Query-Answering Tractability and Optimizations

Mostafa Milani, Leopoldo Bertossi

Weakly-sticky (WS) Datalog+/- is an expressive member of the family of Datalog+/- programs that is based on the syntactic notions of stickiness and weak-acyclicity. Query answering over the WS programs has been investigated, but there is still much work to do on the design and implementation of practical query answering (QA) algorithms and their optimizations. Here, we study sticky and WS programs from the point of view of the behavior of the chase procedure, extending the stickiness property of the chase to that of generalized stickiness of the chase (gsch-property). With this property we specify the semantic class of GSCh programs, which includes sticky and WS programs, and other syntactic subclasses that we identify. In particular, we introduce joint-weakly-sticky (JWS) programs, that include WS programs. We also propose a bottom-up QA algorithm for a range of subclasses of GSCh. The algorithm runs in polynomial time (in data) for JWS programs. Unlike the WS class, JWS is closed under a general magic-sets rewriting procedure for the optimization of programs with existential rules. We apply the magic-sets rewriting in combination with the proposed QA algorithm for the optimization of QA over JWS programs.

DB · Jun 6, 2016

Consistency and Trust in Peer Data Exchange Systems

Leopoldo Bertossi, Loreto Bravo

We propose and investigate a semantics for "peer data exchange systems" where different peers are related by data exchange constraints and trust relationships. These two elements plus the data at the peers' sites and their local integrity constraints are made compatible via a semantics that characterizes sets of "solution instances" for the peers. They are the intended -possibly virtual- instances for a peer that are obtained through a data repair semantics that we introduce and investigate. The semantically correct answers from a peer to a query, the so-called "peer consistent answers", are defined as those answers that are invariant under all its different solution instances. We show that solution instances can be specified as the models of logic programs with a stable model semantics. The repair semantics is based on null values as used in SQL databases, and is also of independent interest for repairs of single databases with respect to integrity constraints.

DB · Apr 22, 2016

A Hybrid Approach to Query Answering under Expressive Datalog+/-

Mostafa Milani, Andrea Cali, Leopoldo Bertossi

Datalog+/- is a family of ontology languages that combine good computational properties with high expressive power. Datalog+/- languages are provably able to capture the most relevant Semantic Web languages. In this paper we consider the class of weakly-sticky (WS) Datalog+/- programs, which allow for certain useful forms of joins in rule bodies as well as extending the well-known class of weakly-acyclic TGDs. So far, only non-deterministic algorithms were known for answering queries on WS Datalog+/- programs. We present novel deterministic query answering algorithms under WS Datalog+/-. In particular, we propose: (1) a bottom-up grounding algorithm based on a query-driven chase, and (2) a hybrid approach based on transforming a WS program into a so-called sticky one, for which query rewriting techniques are known. We discuss how our algorithms can be optimized and effectively applied for query answering in real-world scenarios.

DB · Feb 20, 2016

Causes for Query Answers from Databases, Datalog Abduction and View-Updates: The Presence of Integrity Constraints

Babak Salimi, Leopoldo Bertossi

Causality has been recently introduced in databases, to model, characterize and possibly compute causes for query results (answers). Connections between query-answer causality, consistency-based diagnosis, database repairs (wrt. integrity constraint violations), abductive diagnosis and the view-update problem have been established. In this work we further investigate connections between query-answer causality and abductive diagnosis and the view-update problem. In this context, we also define and investigate the notion of query-answer causality in the presence of integrity constraints.

DB · Feb 7, 2016

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

Zeinab Bahmani, Leopoldo Bertossi, Nikolaos Vasiloglou

Entity resolution (ER), an important and common data cleaning problem, is about detecting data duplicate representations for the same external entities, and merging them into single representations. Relatively recently, declarative rules called "matching dependencies" (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating four components of ER: (a) Building a classifier for duplicate/non-duplicate record pairs using machine learning (ML) techniques; (b) Use of MDs for supporting the blocking phase of ML; (c) Record merging on the basis of the classifier results; and (d) The use of the declarative language "LogiQL" -an extended form of Datalog supported by the "LogicBlox" platform- for all activities related to data processing, and the specification and enforcement of MDs.

DB · Aug 25, 2015

ERBlox: Combining Matching Dependencies with Machine Learning for Entity Resolution

Zeinab Bahmani, Leopoldo Bertossi, Nikolaos Vasiloglou

Entity resolution (ER), an important and common data cleaning problem, is about detecting data duplicate representations for the same external entities, and merging them into single representations. Relatively recently, declarative rules called matching dependencies (MDs) have been proposed for specifying similarity conditions under which attribute values in database records are merged. In this work we show the process and the benefits of integrating three components of ER: (a) Classifiers for duplicate/non-duplicate record pairs built using machine learning (ML) techniques, (b) MDs for supporting both the blocking phase of ML and the merge itself; and (c) The use of the declarative language LogiQL -an extended form of Datalog supported by the LogicBlox platform- for data processing, and the specification and enforcement of MDs.

DB · Jul 1, 2015

From Causes for Database Queries to Repairs and Model-Based Diagnosis and Back

Leopoldo Bertossi, Babak Salimi

In this work we establish and investigate connections between causes for query answers in databases, database repairs wrt. denial constraints, and consistency-based diagnosis. The first two are relatively new research areas in databases, and the third one is an established subject in knowledge representation. We show how to obtain database repairs from causes, and the other way around. Causality problems are formulated as diagnosis problems, and the diagnoses provide causes and their responsibilities. The vast body of research on database repairs can be applied to the newer problems of computing actual causes for query answers and their responsibilities. These connections, which are interesting per se, allow us, after a transition -inspired by consistency-based diagnosis- to computational problems on hitting sets and vertex covers in hypergraphs, to obtain several new algorithmic and complexity results for database causality.
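
The notions of actual cause and responsibility mentioned above can be illustrated with a brute-force sketch for a monotone Boolean query. It follows the standard definition from the database-causality literature, under the simplifying assumption that all tuples are endogenous: a tuple is an actual cause if there is a contingency set of other tuples whose removal keeps the query true while additionally removing the tuple makes it false, and its responsibility is 1/(1 + size of a smallest such set). The function names, the tuple encoding, and the instance are chosen here for illustration; the sketch does not use the repair or hitting-set connections developed in the paper.

```python
from itertools import combinations


def responsibility(db, query, tup):
    """1 / (1 + |smallest contingency set|) for `tup`, or 0.0 if it is not a cause."""
    others = [t for t in db if t != tup]
    for k in range(len(others) + 1):
        for gamma in combinations(others, k):
            remaining = set(db) - set(gamma)
            if query(remaining) and not query(remaining - {tup}):
                return 1 / (1 + k)
    return 0.0


def query(instance):
    """Boolean query: exists x, y with S(x), R(x, y) and S(y)."""
    s_values = {t[1] for t in instance if t[0] == "S"}
    return any(t[0] == "R" and t[1] in s_values and t[2] in s_values
               for t in instance)


# Illustrative instance; every tuple is treated as endogenous.
db = {("R", "a4", "a3"), ("R", "a2", "a1"), ("R", "a3", "a3"),
      ("S", "a4"), ("S", "a2"), ("S", "a3")}

scores = {t: responsibility(db, query, t) for t in db}
print(scores)
```

Here S(a3) is a counterfactual cause, since deleting it alone falsifies the query, while R(a4, a3) becomes one only after R(a3, a3) is removed, and R(a2, a1) is not a cause at all.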

DB · Jun 13, 2015

Query-Answer Causality in Databases: Abductive Diagnosis and View-Updates

Babak Salimi, Leopoldo Bertossi

Causality has been recently introduced in databases, to model, characterize and possibly compute causes for query results (answers). Connections between query causality and consistency-based diagnosis and database repairs (wrt. integrity constraint violations) have been established in the literature. In this work we establish connections between query causality and abductive diagnosis and the view-update problem. The unveiled relationships allow us to obtain new complexity results for query causality -the main focus of our work- and also for the two other areas.

DB · Apr 13, 2015

Tractable Query Answering and Optimization for Extensions of Weakly-Sticky Datalog+-

Mostafa Milani, Leopoldo Bertossi

We consider a semantic class, weakly-chase-sticky (WChS), and a syntactic subclass, jointly-weakly-sticky (JWS), of Datalog+- programs. Both extend that of weakly-sticky (WS) programs, which appear in our applications to data quality. For WChS programs we propose a practical, polynomial-time query answering algorithm (QAA). We establish that the two classes are closed under magic-sets rewritings. As a consequence, QAA can be applied to the optimized programs. QAA takes as inputs the program (including the query) and semantic information about the "finiteness" of predicate positions. For the syntactic subclasses JWS and WS of WChS, this additional information is computable.