Giulia Preti

h-index: 5

3 papers

77 citations

3 Papers

5.3 · LG · Jul 27, 2023 · Code

Counterfactual Explanations for Graph Classification Through the Lenses of Density

Carlo Abrate, Giulia Preti, Francesco Bonchi

Counterfactual examples have emerged as an effective approach to produce simple and understandable post-hoc explanations. In the context of graph classification, previous work has focused on generating counterfactual explanations by manipulating the most elementary units of a graph, i.e., removing an existing edge, or adding a non-existing one. In this paper, we claim that such language of explanation might be too fine-grained, and turn our attention to some of the main characterizing features of real-world complex networks, such as the tendency to close triangles, the existence of recurring motifs, and the organization into dense modules. We thus define a general density-based counterfactual search framework to generate instance-level counterfactual explanations for graph classifiers, which can be instantiated with different notions of dense substructures. In particular, we show two specific instantiations of this general framework: a method that searches for counterfactual graphs by opening or closing triangles, and a method driven by maximal cliques. We also discuss how the general method can be instantiated to exploit any other notion of dense substructures, including, for instance, a given taxonomy of nodes. We evaluate the effectiveness of our approaches on 7 brain network datasets and compare the counterfactual statements generated according to several widely used metrics. Results confirm that adopting a semantically relevant unit of change like density is essential to define versatile and interpretable counterfactual explanation methods.
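To make the triangle-based idea concrete, here is a minimal sketch of a greedy search that only closes triangles until a classifier's label flips. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the greedy selection rule, the `max_steps` budget, and the toy classifier are all hypothetical, and the abstract's "opening" direction is omitted.

```python
from itertools import combinations


def neighbors(edges):
    """Adjacency sets for an undirected graph given as (u, v) tuples."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(edges):
    """Each triangle is seen once per edge, hence the division by 3."""
    adj = neighbors(edges)
    return sum(len(adj[u] & adj[v]) for u, v in edges) // 3


def closing_candidates(edges):
    """Non-edges whose addition closes at least one open wedge."""
    adj = neighbors(edges)
    present = {frozenset(e) for e in edges}
    cands = set()
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            if frozenset((a, b)) not in present:
                cands.add((a, b))
    return sorted(cands)


def triangle_counterfactual(edges, classify, max_steps=10):
    """Greedily close triangles until `classify` changes its label.

    `classify` is any hypothetical function mapping an edge list to a label.
    Returns (counterfactual_edges or None, list_of_added_edges).
    """
    original = classify(edges)
    current = list(edges)
    added = []
    for _ in range(max_steps):
        if classify(current) != original:
            break
        cands = closing_candidates(current)
        if not cands:
            break
        # Assumed heuristic: add the edge that yields the most triangles.
        best = max(cands, key=lambda e: count_triangles(current + [e]))
        current.append(best)
        added.append(best)
    found = classify(current) != original
    return (current if found else None), added


# Toy classifier (an assumption): label 1 iff the graph has a triangle.
toy_classifier = lambda e: int(count_triangles(e) >= 1)
cf_graph, changes = triangle_counterfactual([(0, 1), (1, 2)], toy_classifier)
```

The returned list of added edges plays the role of the explanation: each entry is a wedge that had to be closed to change the prediction.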

1.2 · SI · Jun 11, 2025 · Code

Alice and the Caterpillar: A more descriptive null model for assessing data mining results

Giulia Preti, Gianmarco De Francisci Morales, Matteo Riondato

We introduce novel null models for assessing the results obtained from observed binary transactional and sequence datasets, using statistical hypothesis testing. Our null models maintain more properties of the observed dataset than existing ones. Specifically, they preserve the Bipartite Joint Degree Matrix of the bipartite (multi-)graph corresponding to the dataset, which ensures that the number of caterpillars, i.e., paths of length three, is preserved, in addition to other properties considered by other models. We describe Alice, a suite of Markov chain Monte Carlo algorithms for sampling datasets from our null models, based on a carefully defined set of states and efficient operations to move between them. The results of our experimental evaluation show that Alice mixes fast and scales well, and that our null model finds different significant results than ones previously considered in the literature.
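The link between the Bipartite Joint Degree Matrix (BJDM) and the caterpillar count can be checked on a small example. The sketch below assumes a simple bipartite graph (no repeated edges), with transactions on one side and items on the other; the function names and the toy dataset are illustrative and do not come from the paper. Entry (i, j) of the BJDM counts edges joining a transaction of length i to an item of degree j, and every such edge is the middle edge of (i - 1)(j - 1) paths of length three, so the BJDM alone determines the number of caterpillars.

```python
from collections import Counter


def bjdm(transactions):
    """Map (transaction length, item degree) -> number of edges of that kind."""
    item_degree = Counter(item for t in transactions for item in set(t))
    matrix = Counter()
    for t in transactions:
        items = set(t)
        for item in items:
            matrix[(len(items), item_degree[item])] += 1
    return matrix


def caterpillars_from_bjdm(matrix):
    """Each edge in cell (i, j) is the middle edge of (i - 1) * (j - 1) paths."""
    return sum(count * (i - 1) * (j - 1) for (i, j), count in matrix.items())


def caterpillars_brute_force(transactions):
    """Enumerate paths item_a - t - item_b - s with a != b and t != s."""
    sets = [set(t) for t in transactions]
    total = 0
    for ti, t in enumerate(sets):
        for si, s in enumerate(sets):
            if ti == si:
                continue
            for shared in t & s:
                # Any other item of t can be the free end of the path.
                total += len(t) - 1
    return total


# Toy dataset (an assumption for illustration only).
dataset = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]
assert caterpillars_from_bjdm(bjdm(dataset)) == caterpillars_brute_force(dataset)
```

Any two datasets with the same BJDM therefore have the same caterpillar count under these assumptions, which is the preservation property the abstract describes.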

1.7 · SI · Jun 5

HyDRA: Lossless Hypergraph Summarization via Co-Clustering

Giulia Preti, Aris Anagnostopoulos, Francesco Bonchi

Hypergraphs are a powerful representation for higher-order interactions, but their scale and complexity pose significant data management and analysis challenges. While summarization techniques are widely used to distill simple graphs, lossless summarization for hypergraphs remains unexplored. We introduce HyDRA, the first formal framework for lossless summarization of weighted hypergraphs. In our framework, a summary is a new weighted hypergraph composed of supernodes (groups of nodes) and superhyperedges (groups of hyperedges), paired with a correction table for exact reconstruction. By establishing a conceptual link to co-clustering, we design an efficient, parameter-free greedy algorithm that iteratively merges node and hyperedge clusters to minimize a novel storage-aware cost function. HyDRA employs an incremental update strategy to prevent the costly recomputation of the correction table at each step. Extensive experiments demonstrate that HyDRA achieves a substantial reduction in storage cost (80-93% in some settings, depending on the hypergraph characteristics). Because the resulting summaries are themselves hypergraphs, they can be queried directly, providing fast and accurate approximate answers for various connectivity and centrality queries, and accelerating downstream tasks such as influence maximization.