AI Mar 23, 2023
Extended High Utility Pattern Mining: An Answer Set Programming Based Framework and Applications
Francesco Cauteruccio, Giorgio Terracina
Detecting sets of relevant patterns from a given dataset is an important challenge in data mining. The relevance of a pattern, also called utility in the literature, is a subjective measure and can be assessed from very different points of view. Rule-based languages like Answer Set Programming (ASP) seem well suited for specifying user-provided criteria to assess pattern utility in the form of constraints; moreover, the declarativity of ASP allows for a very easy switch between several criteria in order to analyze the dataset from different points of view. In this paper, we make steps toward extending the notion of High Utility Pattern Mining (HUPM); in particular, we introduce a new framework that allows for new classes of utility criteria not considered in the previous literature. We also show how recent extensions of ASP with external functions can support a fast and effective encoding and testing of the new framework. To demonstrate the potential of the proposed framework, we exploit it as a building block for the definition of an innovative method for predicting ICU admission for COVID-19 patients. Finally, an extensive experimental activity demonstrates, from both a quantitative and a qualitative point of view, the effectiveness of the proposed approach. Under consideration in Theory and Practice of Logic Programming (TPLP).
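For readers unfamiliar with the baseline notion that the abstract extends, the following is a minimal Python sketch of classical high utility pattern mining, not of the paper's ASP framework or its extended utility criteria. It assumes the common textbook setting: each transaction maps items to quantities, each item has a per-unit utility, and a pattern's utility is summed over the transactions that contain all of its items. The names `pattern_utility`, `high_utility_patterns`, and `min_utility` are hypothetical, and the enumeration is brute force, suitable only for tiny examples.

```python
from itertools import combinations


def pattern_utility(pattern, transactions, unit_utility):
    """Classical utility of an itemset: sum of quantity * unit utility
    over every transaction that contains all items of the pattern."""
    total = 0
    for transaction in transactions:  # transaction: dict item -> quantity
        if all(item in transaction for item in pattern):
            total += sum(transaction[item] * unit_utility[item] for item in pattern)
    return total


def high_utility_patterns(transactions, unit_utility, min_utility):
    """Brute-force enumeration of all itemsets whose utility reaches the threshold.

    Exponential in the number of items; shown for illustration only.
    """
    items = sorted({item for transaction in transactions for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for pattern in combinations(items, size):
            utility = pattern_utility(pattern, transactions, unit_utility)
            if utility >= min_utility:
                result[pattern] = utility
    return result


# Toy data (invented for this sketch).
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
unit_utility = {"a": 5, "b": 2, "c": 1}

print(high_utility_patterns(transactions, unit_utility, min_utility=15))
```

Swapping the body of `pattern_utility` for a different scoring rule is the imperative analogue of what the abstract describes as switching between utility criteria, which the authors express declaratively as ASP constraints.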
SE Apr 19
Project resilience as network robustness
Sebastiano A. Piccolo, Giorgio Terracina
Engineering projects are the result of the combined effort of their members. Yet, it has been documented that labor division within projects is unevenly distributed: some project members are specialists undertaking only a few tasks, whereas others are generalists and are responsible for the success of many tasks. Moreover, the latter are often facilitators of project integration. Such a workload distribution prompts one question: how resilient is a project to key personnel loss? Far from being a theoretical problem, the reliance of a project on a few key people can lead to severe economic losses and delays. We argue that current methods to estimate such a risk are unsatisfactory: some methods offer a best-case estimate and are, therefore, too optimistic; other methods fail to capture project fragmentation, leading to biased estimates and unrealistic consequences in many settings. In this paper, we develop a novel method to assess project vulnerability by looking at it through the lens of network robustness. We compare our method against existing alternatives and show that it offers better and more consistent estimates of project resilience to personnel loss.
SI Mar 8
The Theory and Practice of Computing the Bus-Factor
Sebastiano A. Piccolo, Pasquale De Meo, Giorgio Terracina et al.
The bus-factor is a measure of project risk with respect to personnel availability, informally defined as the number of people whose sudden unavailability would cause a project to stall or experience severe delays. Despite its intuitive appeal, existing bus-factor measures rely on heterogeneous modeling assumptions, ambiguous definitions of failure, and domain-specific artifacts, limiting their generality, comparability, and ability to capture project fragmentation. In this paper, we develop a unified, domain-agnostic framework for bus-factor estimation by modeling projects as bipartite graphs of people and tasks and casting the computation of the bus-factor as a family of combinatorial optimization problems. Within this framework, we formalize and reconcile two complementary interpretations of the bus-factor, redundancy and criticality, corresponding to the Maximum Redundant Set and the Minimum Critical Set, respectively, and prove that both formulations are NP-hard. Building on this theoretical foundation, we introduce a novel bus-factor measure inspired by network robustness. Unlike prior approaches, the proposed measure captures both loss of coverage and increasing project fragmentation by tracking the largest connected set of tasks under progressive contributor removal. The resulting measure is normalized, threshold-free, and applicable across domains; we show that its exact computation is NP-hard as well. We further propose efficient linear-time approximation algorithms for all considered measures. Finally, we evaluate their behavior through a sensitivity analysis based on controlled perturbations of project structures, guided by expectations derived from project management theory. Our results show that the robustness-based measure behaves consistently with these expectations and provides a more informative and stable assessment of project risk than existing alternatives.
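The idea of tracking the largest connected set of tasks under progressive contributor removal can be illustrated with a short Python sketch. This is not the authors' measure or algorithm: the removal order (greedily dropping the person with the most tasks, ties broken alphabetically), the normalization by the initial number of tasks, and the names `largest_task_component` and `robustness_curve` are all assumptions made for illustration. Tasks left without any contributor are treated as uncovered and drop out of the graph.

```python
from collections import defaultdict


def largest_task_component(assignments):
    """Number of tasks in the largest connected component of the
    people-task bipartite graph (tasks are linked via shared people)."""
    parent = {}

    def find(task):
        # Union-find lookup with path halving.
        while parent[task] != task:
            parent[task] = parent[parent[task]]
            task = parent[task]
        return task

    for tasks in assignments.values():
        tasks = list(tasks)
        for task in tasks:
            parent.setdefault(task, task)
        # All tasks of one person belong to the same component.
        for task in tasks[1:]:
            root_a, root_b = find(tasks[0]), find(task)
            if root_a != root_b:
                parent[root_a] = root_b

    sizes = defaultdict(int)
    for task in parent:
        sizes[find(task)] += 1
    return max(sizes.values(), default=0)


def robustness_curve(assignments):
    """Fraction of tasks in the largest component after each removal.

    Assumed heuristic: always remove the remaining person with the most tasks.
    """
    remaining = {person: set(tasks) for person, tasks in assignments.items()}
    total_tasks = len(set().union(*remaining.values()))
    if total_tasks == 0:
        return []
    curve = []
    while remaining:
        victim = max(sorted(remaining), key=lambda person: len(remaining[person]))
        del remaining[victim]
        curve.append(largest_task_component(remaining) / total_tasks)
    return curve


# Toy project (invented for this sketch): person -> tasks they can cover.
project = {"ann": {"t1", "t2", "t3"}, "bob": {"t3", "t4"}, "cy": {"t4"}}
curve = robustness_curve(project)
print(curve, sum(curve) / len(curve))
```

Averaging the curve gives one possible normalized, threshold-free summary in the spirit of the abstract; a curve that collapses after the first removal signals heavy reliance on a single contributor.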
AI Oct 21, 2019
A Logic-Based Framework Leveraging Neural Networks for Studying the Evolution of Neurological Disorders
Francesco Calimeri, Francesco Cauteruccio, Luca Cinelli et al.
Deductive formalisms have been strongly developed in recent years; among them, Answer Set Programming (ASP) has gained some momentum and has lately been fruitfully employed in many real-world scenarios. Nonetheless, in spite of a large number of success stories in relevant application areas, and even in industrial contexts, deductive reasoning cannot be considered the ultimate, comprehensive solution to AI; indeed, in several contexts, other approaches prove more useful. Typical Bioinformatics tasks, for instance classification, are currently carried out mostly by Machine Learning (ML) based solutions. In this paper, we focus on the relatively new problem of analyzing the evolution of neurological disorders. In this context, ML approaches have already proven to be a viable solution for classification tasks; here, we show how ASP can play a relevant role in the brain evolution simulation task. In particular, we propose a general and extensible framework to support physicians and researchers in understanding the complex mechanisms underlying neurological disorders. The framework relies on a combined use of ML and ASP, and is general enough to be applied in several other application scenarios, which are outlined in the paper.
AI Jul 22, 2015
Taming Primary Key Violations to Query Large Inconsistent Data
Marco Manna, Francesco Ricca, Giorgio Terracina
Consistent query answering over a database that violates primary key constraints is a classical hard problem in database research that has traditionally been dealt with using logic programming. However, the applicability of existing logic-based solutions is restricted to data sets of moderate size. This paper presents a novel decomposition and pruning strategy that reduces, in polynomial time, the problem of computing the consistent answer to a conjunctive query over a database subject to primary key constraints to a collection of smaller problems of the same sort that can be solved independently. The new strategy is naturally modeled and implemented using Answer Set Programming (ASP). An experiment run on benchmarks from the database world proves the effectiveness and efficiency of our ASP-based approach even on large data sets. To appear in Theory and Practice of Logic Programming (TPLP), Proceedings of ICLP 2015.
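To make the problem statement concrete, here is a minimal Python sketch of the standard semantics of consistent query answering under primary keys: a repair keeps exactly one tuple per key value, and an answer is consistent if it is returned in every repair. This is the naive, exponential baseline, not the decomposition and pruning strategy of the paper, and the names `repairs`, `consistent_answers`, and the toy `employees` relation are hypothetical.

```python
from itertools import product


def repairs(rows, key):
    """Yield every repair: one tuple chosen per primary key value."""
    groups = {}
    for row in rows:
        groups.setdefault(key(row), []).append(row)
    for choice in product(*groups.values()):
        yield list(choice)


def consistent_answers(rows, key, query):
    """Answers returned by the query in every repair (intersection)."""
    result = None
    for repair in repairs(rows, key):
        answers = set(query(repair))
        result = answers if result is None else result & answers
    return result if result is not None else set()


# Toy relation (invented): (id, name, department); id 1 violates the key.
employees = [
    (1, "Ada", "R&D"),
    (1, "Ada", "Sales"),
    (2, "Bob", "R&D"),
]


def by_id(row):
    return row[0]


def all_names(rows):
    return {name for _, name, _ in rows}


def names_in_rnd(rows):
    return {name for _, name, dept in rows if dept == "R&D"}


print(consistent_answers(employees, by_id, all_names))
print(consistent_answers(employees, by_id, names_in_rnd))
```

Because the number of repairs is the product of the conflict group sizes, this enumeration becomes infeasible quickly, which is the scalability limit the abstract says the proposed strategy addresses.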