LGSep 4, 2023
CONFIDERAI: a novel CONFormal Interpretable-by-Design score function for Explainable and Reliable Artificial IntelligenceSara Narteni, Alberto Carlevaro, Fabrizio Dabbene et al.
Everyday life is increasingly influenced by artificial intelligence, and there is no question that machine learning algorithms must be designed to be reliable and trustworthy for everyone. Specifically, computer scientists consider an artificial intelligence system safe and trustworthy if it fulfills five pillars: explainability, robustness, transparency, fairness, and privacy. In addition to these five, we propose a sixth fundamental aspect: conformity, that is, the probabilistic assurance that the system will behave as the machine learner expects. In this paper, we present a methodology to link conformal prediction with explainable machine learning by defining a new score function for rule-based classifiers that leverages rules predictive ability, the geometrical position of points within rules boundaries and the overlaps among rules as well, thanks to the definition of a geometrical rule similarity term. Furthermore, we address the problem of defining regions in the feature space where conformal guarantees are satisfied, by exploiting the definition of conformal critical set and showing how this set can be used to achieve new rules with improved performance on the target class. The overall methodology is tested with promising results on several datasets of real-world interest, such as domain name server tunneling detection or cardiovascular disease prediction.
MLSep 8, 2023
Probabilistic Safety Regions Via Finite Families of Scalable ClassifiersAlberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene et al.
Supervised classification recognizes patterns in the data to separate classes of behaviours. Canonical solutions contain misclassification errors that are intrinsic to the numerical approximating nature of machine learning. The data analyst may minimize the classification error on a class at the expense of increasing the error of the other classes. The error control of such a design phase is often done in a heuristic manner. In this context, it is key to develop theoretical foundations capable of providing probabilistic certifications to the obtained classifiers. In this perspective, we introduce the concept of probabilistic safety region to describe a subset of the input space in which the number of misclassified instances is probabilistically controlled. The notion of scalable classifiers is then exploited to link the tuning of machine learning with error control. Several tests corroborate the approach. They are provided through synthetic data in order to highlight all the steps involved, as well as through a smart mobility application.
MLJan 29, 2025Code
Exact characterization of ε-Safe Decision Regions for exponential family distributions and Multi Cost SVM approximationAlberto Carlevaro, Teodoro Alamo, Fabrizio Dabbene et al.
Probabilistic guarantees on the prediction of data-driven classifiers are necessary to define models that can be considered reliable. This is a key requirement for modern machine learning in which the goodness of a system is measured in terms of trustworthiness, clearly dividing what is safe from what is unsafe. The spirit of this paper is exactly in this direction. First, we introduce a formal definition of ε-Safe Decision Region, a subset of the input space in which the prediction of a target (safe) class is probabilistically guaranteed. Second, we prove that, when data come from exponential family distributions, the form of such a region is analytically determined and controllable by design parameters, i.e. the probability of sampling the target class and the confidence on the prediction. However, the request of having exponential data is not always possible. Inspired by this limitation, we developed Multi Cost SVM, an SVM based algorithm that approximates the safe region and is also able to handle unbalanced data. The research is complemented by experiments and code available for reproducibility.
MLMar 15, 2024
Conformal Predictions for Probabilistically Robust Scalable Machine Learning ClassificationAlberto Carlevaro, Teodoro Alamo Cantarero, Fabrizio Dabbene et al.
Conformal predictions make it possible to define reliable and robust learning algorithms. But they are essentially a method for evaluating whether an algorithm is good enough to be used in practice. To define a reliable learning framework for classification from the very beginning of its design, the concept of scalable classifier was introduced to generalize the concept of classical classifier by linking it to statistical order theory and probabilistic learning theory. In this paper, we analyze the similarities between scalable classifiers and conformal predictions by introducing a new definition of a score function and defining a special set of input variables, the conformal safety set, which can identify patterns in the input space that satisfy the error coverage guarantee, i.e., that the probability of observing the wrong (possibly unsafe) label for points belonging to this set is bounded by a predefined $\varepsilon$ error level. We demonstrate the practical implications of this framework through an application in cybersecurity for identifying DNS tunneling attacks. Our work contributes to the development of probabilistically robust and reliable machine learning models.
ROFeb 14, 2025
Safe and Efficient Social Navigation through Explainable Safety Regions Based on Topological FeaturesVictor Toscano-Duran, Sara Narteni, Alberto Carlevaro et al.
The recent adoption of artificial intelligence in robotics has driven the development of algorithms that enable autonomous systems to adapt to complex social environments. In particular, safe and efficient social navigation is a key challenge, requiring AI not only to avoid collisions and deadlocks but also to interact intuitively and predictably with its surroundings. Methods based on probabilistic models and the generation of conformal safety regions have shown promising results in defining safety regions with a controlled margin of error, primarily relying on classification approaches and explicit rules to describe collision-free navigation conditions. This work extends the existing perspective by investigating how topological features can contribute to the creation of explainable safety regions in social navigation scenarios, enabling the classification and characterization of different simulation behaviors. Rather than relying on behaviors parameters to generate safety regions, we leverage topological features through topological data analysis. We first utilize global rule-based classification to provide interpretable characterizations of different simulation behaviors, distinguishing between safe and unsafe scenarios based on topological properties. Next, we define safety regions, $S_\varepsilon$, representing zones in the topological feature space where collisions are avoided with a maximum classification error of $\varepsilon$. These regions are constructed using adjustable SVM classifiers and order statistics, ensuring a robust and scalable decision boundary. Our approach initially separates simulations with and without collisions, outperforming methods that not incorporate topological features. We further refine safety regions to ensure deadlock-free simulations and integrate both aspects to define a compliant simulation space that guarantees safe and efficient navigation.