LGJan 28Code
GNN Explanations that do not Explain and How to find ThemSteve Azzolin, Stefano Teso, Bruno Lepri et al.
Explanations provided by Self-explainable Graph Neural Networks (SE-GNNs) are fundamental for understanding the model's inner workings and for identifying potential misuse of sensitive attributes. Although recent works have highlighted that these explanations can be suboptimal and potentially misleading, a characterization of their failure cases is unavailable. In this work, we identify a critical failure of SE-GNN explanations: explanations can be unambiguously unrelated to how the SE-GNNs infer labels. We show that, on the one hand, many SE-GNNs can achieve optimal true risk while producing these degenerate explanations, and on the other, most faithfulness metrics can fail to identify these failure modes. Our empirical analysis reveals that degenerate explanations can be maliciously planted (allowing an attacker to hide the use of sensitive attributes) and can also emerge naturally, highlighting the need for reliable auditing. To address this, we introduce a novel faithfulness metric that reliably marks degenerate explanations as unfaithful, in both malicious and natural settings. Our code is available in the supplemental.
LGAug 24, 2022
Deep Symbolic Learning: Discovering Symbols and Rules from PerceptionsAlessandro Daniele, Tommaso Campari, Sagar Malhotra et al.
Neuro-Symbolic (NeSy) integration combines symbolic reasoning with Neural Networks (NNs) for tasks requiring perception and reasoning. Most NeSy systems rely on continuous relaxation of logical knowledge, and no discrete decisions are made within the model pipeline. Furthermore, these methods assume that the symbolic rules are given. In this paper, we propose Deep Symbolic Learning (DSL), a NeSy system that learns NeSy-functions, i.e., the composition of a (set of) perception functions which map continuous data to discrete symbols, and a symbolic function over the set of symbols. DSL learns simultaneously the perception and symbolic functions while being trained only on their composition (NeSy-function). The key novelty of DSL is that it can create internal (interpretable) symbolic representations and map them to perception inputs within a differentiable NN learning pipeline. The created symbols are automatically selected to generate symbolic functions that best explain the data. We provide experimental analysis to substantiate the efficacy of DSL in simultaneously learning perception and symbolic functions.
AIApr 8, 2022
On Projectivity in Markov Logic NetworksSagar Malhotra, Luciano Serafini
Markov Logic Networks (MLNs) define a probability distribution on relational structures over varying domain sizes. Many works have noticed that MLNs, like many other relational models, do not admit consistent marginal inference over varying domain sizes. Furthermore, MLNs learnt on a certain domain do not generalize to new domains of varied sizes. In recent works, connections have emerged between domain size dependence, lifted inference and learning from sub-sampled domains. The central idea to these works is the notion of projectivity. The probability distributions ascribed by projective models render the marginal probabilities of sub-structures independent of the domain cardinality. Hence, projective models admit efficient marginal inference, removing any dependence on the domain size. Furthermore, projective models potentially allow efficient and consistent parameter learning from sub-sampled domains. In this paper, we characterize the necessary and sufficient conditions for a two-variable MLN to be projective. We then isolate a special model in this class of MLNs, namely Relational Block Model (RBM). We show that, in terms of data likelihood maximization, RBM is the best possible projective MLN in the two-variable fragment. Finally, we show that RBMs also admit consistent parameter learning over sub-sampled domains.
AIAug 22, 2023
Lifted Inference beyond First-Order LogicSagar Malhotra, Davide Bizzaro, Luciano Serafini
Weighted First Order Model Counting (WFOMC) is fundamental to probabilistic inference in statistical relational learning models. As WFOMC is known to be intractable in general ($\#$P-complete), logical fragments that admit polynomial time WFOMC are of significant interest. Such fragments are called domain liftable. Recent works have shown that the two-variable fragment of first order logic extended with counting quantifiers ($\mathrm{C^2}$) is domain-liftable. However, many properties of real-world data, like acyclicity in citation networks and connectivity in social networks, cannot be modeled in $\mathrm{C^2}$, or first order logic in general. In this work, we expand the domain liftability of $\mathrm{C^2}$ with multiple such properties. We show that any $\mathrm{C^2}$ sentence remains domain liftable when one of its relations is restricted to represent a directed acyclic graph, a connected graph, a tree (resp. a directed tree) or a forest (resp. a directed forest). All our results rely on a novel and general methodology of "counting by splitting". Besides their application to probabilistic inference, our results provide a general framework for counting combinatorial structures. We expand a vast array of previous results in discrete mathematics literature on directed acyclic graphs, phylogenetic networks, etc.
AIFeb 20, 2023
Weighted First Order Model Counting with Directed Acyclic Graph AxiomsSagar Malhotra, Luciano Serafini
Statistical Relational Learning (SRL) integrates First-Order Logic (FOL) and probability theory for learning and inference over relational data. Probabilistic inference and learning in many SRL models can be reduced to Weighted First Order Model Counting (WFOMC). However, WFOMC is known to be intractable ($\mathrm{\#P_1-}$ complete). Hence, logical fragments that admit polynomial time WFOMC are of significant interest. Such fragments are called domain liftable. Recent line of works have shown the two-variable fragment of FOL, extended with counting quantifiers ($\mathrm{C^2}$) to be domain-liftable. However, many properties of real-world data can not be modelled in $\mathrm{C^2}$. In fact many ubiquitous properties of real-world data are inexressible in FOL. Acyclicity is one such property, found in citation networks, genealogy data, temporal data e.t.c. In this paper we aim to address this problem by investigating the domain liftability of directed acyclicity constraints. We show that the fragment $\mathrm{C^2}$ with a Directed Acyclic Graph (DAG) axiom, i.e., a predicate in the language is axiomatized to represent a DAG, is domain-liftable. We present a method based on principle of inclusion-exclusion for WFOMC of $\mathrm{C^2}$ formulas extended with DAG axioms.
LGJan 23
Is BatchEnsemble a Single Model? On Calibration and Diversity of Efficient EnsemblesAnton Zamyatin, Patrick Indri, Sagar Malhotra et al.
In resource-constrained and low-latency settings, uncertainty estimates must be efficiently obtained. Deep Ensembles provide robust epistemic uncertainty (EU) but require training multiple full-size models. BatchEnsemble aims to deliver ensemble-like EU at far lower parameter and memory cost by applying learned rank-1 perturbations to a shared base network. We show that BatchEnsemble not only underperforms Deep Ensembles but closely tracks a single model baseline in terms of accuracy, calibration and out-of-distribution (OOD) detection on CIFAR10/10C/SVHN. A controlled study on MNIST finds members are near-identical in function and parameter space, indicating limited capacity to realize distinct predictive modes. Thus, BatchEnsemble behaves more like a single model than a true ensemble.
LGNov 9, 2025
Probably Approximately Global Robustness CertificationPeter Blohm, Patrick Indri, Thomas Gärtner et al.
We propose and investigate probabilistic guarantees for the adversarial robustness of classification algorithms. While traditional formal verification approaches for robustness are intractable and sampling-based approaches do not provide formal guarantees, our approach is able to efficiently certify a probabilistic relaxation of robustness. The key idea is to sample an $ε$-net and invoke a local robustness oracle on the sample. Remarkably, the size of the sample needed to achieve probably approximately global robustness guarantees is independent of the input dimensionality, the number of classes, and the learning algorithm itself. Our approach can, therefore, be applied even to large neural networks that are beyond the scope of traditional formal verification. Experiments empirically confirm that it characterizes robustness better than state-of-the-art sampling-based approaches and scales better than formal methods.
LGFeb 4, 2025
Beyond Topological Self-Explainable GNNs: A Formal Explainability PerspectiveSteve Azzolin, Sagar Malhotra, Andrea Passerini et al.
Self-Explainable Graph Neural Networks (SE-GNNs) are popular explainable-by-design GNNs, but their explanations' properties and limitations are not well understood. Our first contribution fills this gap by formalizing the explanations extracted by some popular SE-GNNs, referred to as Minimal Explanations (MEs), and comparing them to established notions of explanations, namely Prime Implicant (PI) and faithful explanations. Our analysis reveals that MEs match PI explanations for a restricted but significant family of tasks. In general, however, they can be less informative than PI explanations and are surprisingly misaligned with widely accepted notions of faithfulness. Although faithful and PI explanations are informative, they are intractable to find and we show that they can be prohibitively large. Given these observations, a natural choice is to augment SE-GNNs with alternative modalities of explanations taking care of SE-GNNs' limitations. To this end, we propose Dual-Channel GNNs that integrate a white-box rule extractor and a standard SE-GNN, adaptively combining both channels. Our experiments show that even a simple instantiation of Dual-Channel GNNs can recover succinct rules and perform on par or better than widely used SE-GNNs.
LGFeb 21, 2024
Simple and Effective Transfer Learning for Neuro-Symbolic IntegrationAlessandro Daniele, Tommaso Campari, Sagar Malhotra et al.
Deep Learning (DL) techniques have achieved remarkable successes in recent years. However, their ability to generalize and execute reasoning tasks remains a challenge. A potential solution to this issue is Neuro-Symbolic Integration (NeSy), where neural approaches are combined with symbolic reasoning. Most of these methods exploit a neural network to map perceptions to symbols and a logical reasoner to predict the output of the downstream task. These methods exhibit superior generalization capacity compared to fully neural architectures. However, they suffer from several issues, including slow convergence, learning difficulties with complex perception tasks, and convergence to local minima. This paper proposes a simple yet effective method to ameliorate these problems. The key idea involves pretraining a neural model on the downstream task. Then, a NeSy model is trained on the same task via transfer learning, where the weights of the perceptual part are injected from the pretrained network. The key observation of our work is that the neural network fails to generalize only at the level of the symbolic part while being perfectly capable of learning the mapping from perceptions to symbols. We have tested our training strategy on various SOTA NeSy methods and datasets, demonstrating consistent improvements in the aforementioned problems.
DMOct 24, 2025
On Local Limits of Sparse Random Graphs: Color Convergence and the Refined Configuration ModelAlexander Pluska, Sagar Malhotra
Local convergence has emerged as a fundamental tool for analyzing sparse random graph models. We introduce a new notion of local convergence, color convergence, based on the Weisfeiler-Leman algorithm. Color convergence fully characterizes the class of random graphs that are well-behaved in the limit for message-passing graph neural networks. Building on this, we propose the Refined Configuration Model (RCM), a random graph model that generalizes the configuration model. The RCM is universal with respect to local convergence among locally tree-like random graph models, including Erdős-Rényi, stochastic block and configuration models. Finally, this framework enables a complete characterization of the random trees that arise as local limits of such graphs.
LGOct 10, 2025
Prime Implicant Explanations for Reaction Feasibility PredictionKlaus Weinbauer, Tieu-Long Phan, Peter F. Stadler et al.
Machine learning models that predict the feasibility of chemical reactions have become central to automated synthesis planning. Despite their predictive success, these models often lack transparency and interpretability. We introduce a novel formulation of prime implicant explanations--also known as minimally sufficient reasons--tailored to this domain, and propose an algorithm for computing such explanations in small-scale reaction prediction tasks. Preliminary experiments demonstrate that our notion of prime implicant explanations conservatively captures the ground truth explanations. That is, such explanations often contain redundant bonds and atoms but consistently capture the molecular attributes that are essential for predicting reaction feasibility.
LGJun 11, 2024
Logical Distillation of Graph Neural NetworksAlexander Pluska, Pascal Welke, Thomas Gärtner et al.
We present a logic based interpretable model for learning on graphs and an algorithm to distill this model from a Graph Neural Network (GNN). Recent results have shown connections between the expressivity of GNNs and the two-variable fragment of first-order logic with counting quantifiers (C2). We introduce a decision-tree based model which leverages an extension of C2 to distill interpretable logical classifiers from GNNs. We test our approach on multiple GNN architectures. The distilled models are interpretable, succinct, and attain similar accuracy to the underlying GNN. Furthermore, when the ground truth is expressible in C2, our approach outperforms the GNN.
AIMar 23, 2024
Understanding Domain-Size Generalization in Markov Logic NetworksFlorian Chen, Felix Weitkämper, Sagar Malhotra
We study the generalization behavior of Markov Logic Networks (MLNs) across relational structures of different sizes. Multiple works have noticed that MLNs learned on a given domain generalize poorly across domains of different sizes. This behavior emerges from a lack of internal consistency within an MLN when used across different domain sizes. In this paper, we quantify this inconsistency and bound it in terms of the variance of the MLN parameters. The parameter variance also bounds the KL divergence between an MLN's marginal distributions taken from different domain sizes. We use these bounds to show that maximizing the data log-likelihood while simultaneously minimizing the parameter variance corresponds to two natural notions of generalization across domain sizes. Our theoretical results apply to Exponential Random Graphs and other Markov network based relational models. Finally, we observe that solutions known to decrease the variance of the MLN parameters, like regularization and Domain-Size Aware MLNs, increase the internal consistency of the MLNs. We empirically verify our results on four different datasets, with different methods to control parameter variance, showing that controlling parameter variance leads to better generalization.
LOOct 12, 2021
Weighted Model Counting in FO2 with Cardinality Constraints and Counting Quantifiers: A Closed Form FormulaSagar Malhotra, Luciano Serafini
Weighted First-Order Model Counting (WFOMC) computes the weighted sum of the models of a first-order logic theory on a given finite domain. First-Order Logic theories that admit polynomial-time WFOMC w.r.t domain cardinality are called domain liftable. We introduce the concept of lifted interpretations as a tool for formulating closed-forms for WFOMC. Using lifted interpretations, we reconstruct the closed-form formula for polynomial-time FOMC in the universally quantified fragment of FO2, earlier proposed by Beame et al. We then expand this closed-form to incorporate cardinality constraints, existential quantifiers, and counting quantifiers (a.k.a C2) without losing domain-liftability. Finally, we show that the obtained closed-form motivates a natural definition of a family of weight functions strictly larger than symmetric weight functions.
AISep 25, 2020
Weighted Model Counting in the two variable fragment with Cardinality Constraints: A Closed Form FormulaSagar Malhotra, Luciano Serafini
Weighted First-Order Model Counting (WFOMC) computes the weighted sum of the models of a first-order theory on a given finite domain. WFOMC has emerged as a fundamental tool for probabilistic inference. Algorithms for WFOMC that run in polynomial time w.r.t. the domain size are called lifted inference algorithms. Such algorithms have been developed for multiple extensions of FO2(the fragment of first-order logic with two variables) for the special case of symmetric weight functions. We introduce the concept of lifted interpretations as a tool for formulating polynomials for WFOMC. Using lifted interpretations, we reconstruct the closed-form formula for polynomial-time FOMC in the universal fragment of FO2, earlier proposed by Beame et al. We then expand this closed-form to incorporate existential quantifiers and cardinality constraints without losing domain-liftability. Finally, we show that the obtained closed-form motivates a natural definition of a family of weight functions strictly larger than symmetric weight functions.