Nicolas Bousquet

ML
h-index26
8papers
16citations
Novelty54%
AI Score53

8 Papers

53.3DCJun 3
The local complexity of certifying parity

Nicolas Bousquet, Laurent Feuilloley, Jorge Valenzuela et al.

In this paper, we consider the problem of locally certifying that the size of a network is even, or more generally, congruent to some fixed number. The parity property is one of the simplest global properties, and it plays an intriguing role in local certification. On the one hand, it is one of the simplest properties in cycles because it is equivalent to 2-colorability, and hence can be certified with a single bit. On the other hand, in general graphs, no non-trivial lower bound on the size of the certificates is known, and the known upper bound basically consists in certifying the \emph{exact} value of $n$. In addition, the nature of the problem makes all the known lower bound approaches fail. We uncover a surprising landscape for parity across different models and graph structures: * In general graphs equipped with identifiers, when allowing verification radius 2, parity can be certified with a constant number of bits. * But in the model of anonymous graphs and allowing verification radius only 1, parity requires $Ω(\log \log^*n)$ bits. * Finally, in bounded expansion graph classes (such as bounded-degree graphs and planar graphs), the lower bound does not apply: in the same restricted model we can design a constant-size certification. We introduce several new tools that we expect to be useful in other contexts, in particular ways to \emph{encode a parent at each node with a constant number of bits} (via implicit use of the IDs and conflict-free colorings) and a new lower bound technique, with complex topologies and higher-order Ramsey-type arguments.

DMJul 17, 2017
A Vizing-like theorem for union vertex-distinguishing edge coloring

Nicolas Bousquet, Antoine Dailly, Eric Duchene et al.

We introduce a variant of the vertex-distinguishing edge coloring problem, where each edge is assigned a subset of colors. The label of a vertex is the union of the sets of colors on edges incident to it. In this paper we investigate the problem of finding a coloring with the minimum number of colors where every vertex receives a distinct label. Finding such a coloring generalizes several other well-known problems of vertex-distinguishing colorings in graphs.We show that for any graph (without connected component reduced to an edge or a single vertex), the minimum number of colors for which such a coloring exists can only take 3possible values depending on the order of the graph. Moreover, we provide the exact value for paths, cycles and complete binary trees.

36.0MLMay 18
Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa et al.

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we build an explicit decomposition Riesz Basis allowing to easily compute the decomposition. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple but mighty algorithm to estimate the decomposition from a data sample in a model-agnostic setting and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.

MLMar 3
Exact Functional ANOVA Decomposition for Categorical Inputs Models

Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa et al.

Functional ANOVA offers a principled framework for interpretability by decomposing a model's prediction into main effects and higher-order interactions. For independent features, this decomposition is well-defined, strongly linked with SHAP values, and serves as a cornerstone of additive explainability. However, the lack of an explicit closed-form expression for general dependent distributions has forced practitioners to rely on costly sampling-based approximations. We completely resolve this limitation for categorical inputs. By bridging functional analysis with the extension of discrete Fourier analysis, we derive a closed-form decomposition without any assumption. Our formulation is computationally very efficient. It seamlessly recovers the classical independent case and extends to arbitrary dependence structures, including distributions with non-rectangular support. Furthermore, leveraging the intrinsic link between SHAP and ANOVA under independence, our framework yields a natural generalization of SHAP values for the general categorical setting.

CVNov 4, 2024
Benchmarking XAI Explanations with Human-Aligned Evaluations

Rémi Kazmierczak, Steve Azzolin, Eloïse Berthier et al.

We introduce PASTA (Perceptual Assessment System for explanaTion of Artificial Intelligence), a novel human-centric framework for evaluating eXplainable AI (XAI) techniques in computer vision. Our first contribution is the creation of the PASTA-dataset, the first large-scale benchmark that spans a diverse set of models and both saliency-based and concept-based explanation methods. This dataset enables robust, comparative analysis of XAI techniques based on human judgment. Our second contribution is an automated, data-driven benchmark that predicts human preferences using the PASTA-dataset. This scoring called PASTA-score method offers scalable, reliable, and consistent evaluation aligned with human perception. Additionally, our benchmark allows for comparisons between explanations across different modalities, an aspect previously unaddressed. We then propose to apply our scoring method to probe the interpretability of existing models and to build more human interpretable XAI methods.

MLDec 9, 2025
WTNN: Weibull-Tailored Neural Networks for survival analysis

Gabrielle Rives, Olivier Lopez, Nicolas Bousquet

The Weibull distribution is a commonly adopted choice for modeling the survival of systems subject to maintenance over time. When only proxy indicators and censored observations are available, it becomes necessary to express the distribution's parameters as functions of time-dependent covariates. Deep neural networks provide the flexibility needed to learn complex relationships between these covariates and operational lifetime, thereby extending the capabilities of traditional regression-based models. Motivated by the analysis of a fleet of military vehicles operating in highly variable and demanding environments, as well as by the limitations observed in existing methodologies, this paper introduces WTNN, a new neural network-based modeling framework specifically designed for Weibull survival studies. The proposed architecture is specifically designed to incorporate qualitative prior knowledge regarding the most influential covariates, in a manner consistent with the shape and structure of the Weibull distribution. Through numerical experiments, we show that this approach can be reliably trained on proxy and right-censored data, and is capable of producing robust and interpretable survival predictions that can improve existing approaches.

MLOct 8, 2025
Multivariate Bernoulli Hoeffding Decomposition: From Theory to Sensitivity Analysis

Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa et al.

Understanding the behavior of predictive models with random inputs can be achieved through functional decompositions into sub-models that capture interpretable effects of input groups. Building on recent advances in uncertainty quantification, the existence and uniqueness of a generalized Hoeffding decomposition have been established for correlated input variables, using oblique projections onto suitable functional subspaces. This work focuses on the case of Bernoulli inputs and provides a complete analytical characterization of the decomposition. We show that, in this discrete setting, the associated subspaces are one-dimensional and that the decomposition admits a closed-form representation. One of the main contributions of this study is to generalize the classical Fourier--Walsh--Hadamard decomposition for pseudo-Boolean functions to the correlated case, yielding an oblique version when the underlying distribution is not a product measure, and recovering the standard orthogonal form when independence holds. This explicit structure offers a fully interpretable framework, clarifying the contribution of each input combination and theoretically enabling model reverse engineering. From this formulation, explicit sensitivity measures-such as Sobol' indices and Shapley effects-can be directly derived. Numerical experiments illustrate the practical interest of the approach for decision-support problems involving binary features. The paper concludes with perspectives on extending the methodology to high-dimensional settings and to models involving inputs with finite, non-binary support.

MLDec 6, 2017
An innovative solution for breast cancer textual big data analysis

Nicolas Thiebaut, Antoine Simoulin, Karl Neuberger et al.

The digitalization of stored information in hospitals now allows for the exploitation of medical data in text format, as electronic health records (EHRs), initially gathered for other purposes than epidemiology. Manual search and analysis operations on such data become tedious. In recent years, the use of natural language processing (NLP) tools was highlighted to automatize the extraction of information contained in EHRs, structure it and perform statistical analysis on this structured information. The main difficulties with the existing approaches is the requirement of synonyms or ontology dictionaries, that are mostly available in English only and do not include local or custom notations. In this work, a team composed of oncologists as domain experts and data scientists develop a custom NLP-based system to process and structure textual clinical reports of patients suffering from breast cancer. The tool relies on the combination of standard text mining techniques and an advanced synonym detection method. It allows for a global analysis by retrieval of indicators such as medical history, tumor characteristics, therapeutic responses, recurrences and prognosis. The versatility of the method allows to obtain easily new indicators, thus opening up the way for retrospective studies with a substantial reduction of the amount of manual work. With no need for biomedical annotators or pre-defined ontologies, this language-agnostic method reached an good extraction accuracy for several concepts of interest, according to a comparison with a manually structured file, without requiring any existing corpus with local or new notations.