AIJun 3
Learning Admissible Heuristics via Cost PartitioningHugo Barral, Quentin Cappart, Marie-José Huguet et al.
Admissible heuristics are essential for optimal planning, yet learning them remains challenging due to the risk of overestimation. Cost partitioning combines multiple abstraction heuristics while preserving admissibility, but computing optimal partitions online is expensive. We propose a framework that learns to infer admissible cost partitions by leveraging the Lagrangian dual equivalence between cost partitioning and multiplier prediction. Planning states and patterns are encoded as labelled graphs, and an action-centric variant of the Weisfeiler-Leman algorithm extracts structural feature vectors. A deep architecture with axial self-attention and a softmax output layer maps these features to cost weights that satisfy the partition constraints by construction, ensuring admissibility. Experiments demonstrate reduced node expansions compared to suboptimal partitioning baselines while maintaining strict admissibility. To our knowledge, this is the first machine-learned heuristic guaranteed to be admissible.
LGSep 2, 2022
Exploiting Fairness to Enhance Sensitive Attributes ReconstructionJulien Ferry, Ulrich Aïvodji, Sébastien Gambs et al.
In recent years, a growing body of work has emerged on how to learn machine learning models under fairness constraints, often expressed with respect to some sensitive attributes. In this work, we consider the setting in which an adversary has black-box access to a target model and show that information about this model's fairness can be exploited by the adversary to enhance his reconstruction of the sensitive attributes of the training data. More precisely, we propose a generic reconstruction correction method, which takes as input an initial guess made by the adversary and corrects it to comply with some user-defined constraints (such as the fairness information) while minimizing the changes in the adversary's guess. The proposed method is agnostic to the type of target model, the fairness-aware learning method as well as the auxiliary knowledge of the adversary. To assess the applicability of our approach, we have conducted a thorough experimental evaluation on two state-of-the-art fair learning methods, using four different fairness metrics with a wide range of tolerances and with three datasets of diverse sizes and sensitive attributes. The experimental results demonstrate the effectiveness of the proposed approach to improve the reconstruction of the sensitive attributes of the training set.
AIAug 29, 2023
Probabilistic Dataset Reconstruction from Interpretable ModelsJulien Ferry, Ulrich Aïvodji, Sébastien Gambs et al.
Interpretability is often pointed out as a key requirement for trustworthy machine learning. However, learning and releasing models that are inherently interpretable leaks information regarding the underlying training data. As such disclosure may directly conflict with privacy, a precise quantification of the privacy impact of such breach is a fundamental problem. For instance, previous work have shown that the structure of a decision tree can be leveraged to build a probabilistic reconstruction of its training dataset, with the uncertainty of the reconstruction being a relevant metric for the information leak. In this paper, we propose of a novel framework generalizing these probabilistic reconstructions in the sense that it can handle other forms of interpretable models and more generic types of knowledge. In addition, we demonstrate that under realistic assumptions regarding the interpretable models' structure, the uncertainty of the reconstruction can be computed efficiently. Finally, we illustrate the applicability of our approach on both decision trees and rule lists, by comparing the theoretical information leak associated to either exact or heuristic learning algorithms. Our results suggest that optimal interpretable models are often more compact and leak less information regarding their training data than greedily-built ones, for a given accuracy level.
AIMar 21, 2022
Optimizing Binary Decision Diagrams with MaxSAT for classificationHao Hu, Marie-José Huguet, Mohamed Siala
The growing interest in explainable artificial intelligence (XAI) for critical decision making motivates the need for interpretable machine learning (ML) models. In fact, due to their structure (especially with small sizes), these models are inherently understandable by humans. Recently, several exact methods for computing such models are proposed to overcome weaknesses of traditional heuristic methods by providing more compact models or better prediction quality. Despite their compressed representation of Boolean functions, Binary decision diagrams (BDDs) did not gain enough interest as other interpretable ML models. In this paper, we first propose SAT-based models for learning optimal BDDs (in terms of the number of features) that classify all input examples. Then, we lift the encoding to a MaxSAT model to learn optimal BDDs in limited depths, that maximize the number of examples correctly classified. Finally, we tackle the fragmentation problem by introducing a method to merge compatible subtrees for the BDDs found via the MaxSAT model. Our empirical study shows clear benefits of the proposed approach in terms of prediction quality and intrepretability (i.e., lighter size) compared to the state-of-the-art approaches.
LGApr 11, 2023
Learning Optimal Fair Scoring Systems for Multi-Class ClassificationJulien Rouzot, Julien Ferry, Marie-José Huguet
Machine Learning models are increasingly used for decision making, in particular in high-stakes applications such as credit scoring, medicine or recidivism prediction. However, there are growing concerns about these models with respect to their lack of interpretability and the undesirable biases they can generate or reproduce. While the concepts of interpretability and fairness have been extensively studied by the scientific community in recent years, few works have tackled the general multi-class classification problem under fairness constraints, and none of them proposes to generate fair and interpretable models for multi-class classification. In this paper, we use Mixed-Integer Linear Programming (MILP) techniques to produce inherently interpretable scoring systems under sparsity and fairness constraints, for the general multi-class classification setup. Our work generalizes the SLIM (Supersparse Linear Integer Models) framework that was proposed by Rudin and Ustun to learn optimal scoring systems for binary classification. The use of MILP techniques allows for an easy integration of diverse operational constraints (such as, but not restricted to, fairness or sparsity), but also for the building of certifiably optimal models (or sub-optimal models with bounded optimality gap).
LGDec 22, 2023
SoK: Taming the Triangle -- On the Interplays between Fairness, Interpretability and Privacy in Machine LearningJulien Ferry, Ulrich Aïvodji, Sébastien Gambs et al.
Machine learning techniques are increasingly used for high-stakes decision-making, such as college admissions, loan attribution or recidivism prediction. Thus, it is crucial to ensure that the models learnt can be audited or understood by human users, do not create or reproduce discrimination or bias, and do not leak sensitive information regarding their training data. Indeed, interpretability, fairness and privacy are key requirements for the development of responsible machine learning, and all three have been studied extensively during the last decade. However, they were mainly considered in isolation, while in practice they interplay with each other, either positively or negatively. In this Systematization of Knowledge (SoK) paper, we survey the literature on the interactions between these three desiderata. More precisely, for each pairwise interaction, we summarize the identified synergies and tensions. These findings highlight several fundamental theoretical and empirical conflicts, while also demonstrating that jointly considering these different requirements is challenging when one aims at preserving a high level of utility. To solve this issue, we also discuss possible conciliation mechanisms, showing that a careful design can enable to successfully handle these different concerns in practice.
LGMar 18, 2024
Smooth Sensitivity for Learning Differentially-Private yet Accurate Rule ListsTimothée Ly, Julien Ferry, Marie-José Huguet et al.
Differentially-private (DP) mechanisms can be embedded into the design of a machine learning algorithm to protect the resulting model against privacy leakage. However, this often comes with a significant loss of accuracy due to the noise added to enforce DP. In this paper, we aim at improving this trade-off for a popular class of machine learning algorithms leveraging the Gini impurity as an information gain criterion to greedily build interpretable models such as decision trees or rule lists. To this end, we establish the smooth sensitivity of the Gini impurity, which can be used to obtain thorough DP guarantees while adding noise scaled with tighter magnitude. We illustrate the applicability of this mechanism by integrating it within a greedy algorithm producing rule list models, motivated by the fact that such models remain understudied in the DP literature. Our theoretical analysis and experimental results confirm that the DP rule lists models integrating smooth sensitivity have higher accuracy that those using other DP frameworks based on global sensitivity, for identical privacy budgets.
LGSep 9, 2019
Learning Fair Rule ListsUlrich Aïvodji, Julien Ferry, Sébastien Gambs et al.
As the use of black-box models becomes ubiquitous in high stake decision-making systems, demands for fair and interpretable models are increasing. While it has been shown that interpretable models can be as accurate as black-box models in several critical domains, existing fair classification techniques that are interpretable by design often display poor accuracy/fairness tradeoffs in comparison with their non-interpretable counterparts. In this paper, we propose FairCORELS, a fair classification technique interpretable by design, whose objective is to learn fair rule lists. Our solution is a multi-objective variant of CORELS, a branch-and-bound algorithm to learn rule lists, that supports several statistical notions of fairness. Examples of such measures include statistical parity, equal opportunity and equalized odds. The empirical evaluation of FairCORELS on real-world datasets demonstrates that it outperforms state-of-the-art fair classification techniques that are interpretable by design while being competitive with non-interpretable ones.
AIJan 23, 2017
Constraint programming for planning test campaigns of communications satellitesEmmanuel Hébrard, Marie-José Huguet, Daniel Veysseire et al.
The payload of communications satellites must go through a series of tests to assert their ability to survive in space. Each test involves some equipment of the payload to be active, which has an impact on the temperature of the payload. Sequencing these tests in a way that ensures the thermal stability of the payload and minimizes the overall duration of the test campaign is a very important objective for satellite manufacturers. The problem can be decomposed in two sub-problems corresponding to two objectives: First, the number of distinct configurations necessary to run the tests must be minimized. This can be modeled as packing the tests into configurations, and we introduce a set of implied constraints to improve the lower bound of the model. Second, tests must be sequenced so that the number of times an equipment unit has to be switched on or off is minimized. We model this aspect using the constraint Switch, where a buffer with limited capacity represents the currently active equipment units, and we introduce an improvement of the propagation algorithm for this constraint. We then introduce a search strategy in which we sequentially solve the sub-problems (packing and sequencing). Experiments conducted on real and random instances show the respective interest of our contributions.