DB LGSep 11, 2025

Database Views as Explanations for Relational Deep Learning

Agapi Rissaki, Ilias Fountalis, Wolfgang Gatterbauer, Benny Kimelfeld

arXiv:2509.09482v11.2h-index: 2

Originality Incremental advance

AI Analysis

This addresses the interpretability challenge for users of relational deep learning models, offering a novel explanation method that is incremental in adapting existing determinacy concepts.

The paper tackles the problem of explaining deep learning models over relational databases by proposing a framework that uses database views as global abductive explanations, highlighting key parts of the database contributing to predictions, and demonstrates its usefulness and efficiency through an empirical study on the RelBench collection.

In recent years, there has been significant progress in the development of deep learning models over relational databases, including architectures based on heterogeneous graph neural networks (hetero-GNNs) and heterogeneous graph transformers. In effect, such architectures state how the database records and links (e.g., foreign-key references) translate into a large, complex numerical expression, involving numerous learnable parameters. This complexity makes it hard to explain, in human-understandable terms, how a model uses the available data to arrive at a given prediction. We present a novel framework for explaining machine-learning models over relational databases, where explanations are view definitions that highlight focused parts of the database that mostly contribute to the model's prediction. We establish such global abductive explanations by adapting the classic notion of determinacy by Nash, Segoufin, and Vianu (2010). In addition to tuning the tradeoff between determinacy and conciseness, the framework allows controlling the level of granularity by adopting different fragments of view definitions, such as ones highlighting whole columns, foreign keys between tables, relevant groups of tuples, and so on. We investigate the realization of the framework in the case of hetero-GNNs. We develop heuristic algorithms that avoid the exhaustive search over the space of all databases. We propose techniques that are model-agnostic, and others that are tailored to hetero-GNNs via the notion of learnable masking. Our approach is evaluated through an extensive empirical study on the RelBench collection, covering a variety of domains and different record-level tasks. The results demonstrate the usefulness of the proposed explanations, as well as the efficiency of their generation.

View on arXiv PDF

Similar