NA · May 21
Dimension-Free Multimodal Sampling via Preconditioned Annealed Langevin Dynamics
Lorenzo Baldassari, Josselin Garnier, Knut Solna et al.
Designing sampling algorithms for multimodal targets that remain stable under refinement of the finite-dimensional approximation of an underlying function-space problem is a central challenge. Annealed Langevin dynamics (ALD) is a natural alternative to classical Langevin in this context, since it is often observed to improve exploration across modes. Yet a gap remains between its empirical success and existing theory: under which conditions can ALD be guaranteed to remain stable across dimensions? In this paper, we bridge this gap by providing a uniform-in-dimension analysis of continuous-time ALD for Gaussian-mixture targets. Along an explicit annealing path obtained by gradually removing Gaussian smoothing from the target, we identify spectral conditions linking the smoothing covariance to the component covariances under which ALD achieves a prescribed accuracy in Kullback-Leibler divergence within a dimension-uniform time horizon. We then establish stability in a perturbative regime with imperfect initialization and approximate scores. Under a misspecified-mixture score model, we show that preconditioning ALD with an operator whose spectrum decays sufficiently fast prevents error terms from accumulating across coordinates and thereby preserves dimension-uniform control.
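The annealing path described in this abstract (Gaussian smoothing gradually removed from a Gaussian-mixture target) can be illustrated with a toy example. The following is a minimal sketch, not the authors' method: it is one-dimensional, unpreconditioned, and uses a plain Euler-Maruyama discretization, whereas the paper analyzes continuous-time dynamics. The function names, the linear smoothing schedule, and all numerical values are assumptions chosen for illustration. It relies only on the fact that smoothing a mixture of N(m_i, v_i) with N(0, sigma2) gives a mixture of N(m_i, v_i + sigma2), so the annealed score is available in closed form.

```python
import numpy as np


def smoothed_score(x, weights, means, variances, sigma2):
    """Score (gradient of the log-density) of the 1D mixture sum_i w_i N(m_i, v_i + sigma2)."""
    v = variances + sigma2                      # smoothed component variances, shape (k,)
    diff = x[:, None] - means[None, :]          # shape (n, k)
    log_comp = np.log(weights) - 0.5 * np.log(2.0 * np.pi * v) - 0.5 * diff**2 / v
    log_comp -= log_comp.max(axis=1, keepdims=True)   # stabilize the softmax
    resp = np.exp(log_comp)
    resp /= resp.sum(axis=1, keepdims=True)     # posterior component responsibilities
    return (resp * (-diff / v)).sum(axis=1)


def annealed_langevin(n, weights, means, variances, sigma2_start, n_steps, step, rng):
    """Run n particles along a linear schedule sigma2_start -> 0 (hypothetical schedule)."""
    # Initialize exactly from the most heavily smoothed mixture.
    comp = rng.choice(len(weights), size=n, p=weights)
    x = means[comp] + np.sqrt(variances[comp] + sigma2_start) * rng.standard_normal(n)
    for sigma2 in np.linspace(sigma2_start, 0.0, n_steps):
        drift = smoothed_score(x, weights, means, variances, sigma2)
        x = x + step * drift + np.sqrt(2.0 * step) * rng.standard_normal(n)
    return x


# Illustrative bimodal target: two well-separated, equally weighted modes.
weights = np.array([0.5, 0.5])
means = np.array([-3.0, 3.0])
variances = np.array([0.25, 0.25])
rng = np.random.default_rng(0)
samples = annealed_langevin(2000, weights, means, variances, 9.0, 2000, 0.01, rng)
print("fraction of samples in the right-hand mode:", (samples > 0).mean())
```

Starting from a heavily smoothed law lets particles reach both modes before the smoothing is removed, which is the exploration benefit the abstract attributes to ALD. The sketch says nothing about the dimension-uniform guarantees, which depend on the spectral conditions studied in the paper.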
AP · Mar 6, 2013
Modeling active electrolocation in weakly electric fish
Habib Ammari, Thomas Boulier, Josselin Garnier
In this paper, we provide a mathematical model for electrolocation in weakly electric fish. We first investigate the forward complex conductivity problem and derive the approximate boundary conditions on the skin of the fish. Then we provide a dipole approximation for small targets away from the fish. Based on this approximation, we obtain a non-iterative location search algorithm using multi-frequency measurements. We present numerical experiments to illustrate the performance and the stability of the proposed multi-frequency location search algorithm. Finally, in the case of disk- and ellipse-shaped targets, we provide a method to reconstruct separately the conductivity, the permittivity, and the size of the targets from multi-frequency measurements.
MATH-PH · Feb 26, 2013
Shape recognition and classification in electro-sensing
Habib Ammari, Thomas Boulier, Josselin Garnier et al.
This paper aims at advancing the field of electro-sensing. It exhibits the physical mechanism underlying shape perception for weakly electric fish. These fish orient themselves at night in complete darkness by employing their active electrolocation system. They generate a stable, high-frequency, weak electric field and perceive the transdermal potential modulations caused by a nearby target with different admittivity than the surrounding water. In this paper, we explain how weakly electric fish might identify and classify a target, knowing in advance that the latter belongs to a certain collection of shapes. Our model of the weakly electric fish relies on differential imaging, i.e., forming an image from the perturbations of the field due to targets, and on physics-based classification. The electric fish would first locate the target using a specific location search algorithm. Then it could extract, from the perturbations of the electric field, generalized (or high-order) polarization tensors of the target. Computing, from the extracted features, invariants under rigid motions and scaling yields shape descriptors. The weakly electric fish might classify a target by comparing its invariants with those of a set of learned shapes. On the other hand, when measurements are taken at multiple frequencies, the fish might exploit the shifts and use the spectral content of the generalized polarization tensors to dramatically improve the stability with respect to measurement noise of the classification procedure in electro-sensing. Surprisingly, it turns out that the first-order polarization tensor at multiple frequencies could be enough for the purpose of classification. A procedure to eliminate the background field in the case where the permittivity of the surrounding medium can be neglected, and hence further improve the stability of the classification process, is also discussed.
OC · Feb 26, 2013
Target Identification Using Dictionary Matching of Generalized Polarization Tensors
Habib Ammari, Thomas Boulier, Josselin Garnier et al.
The aim of this paper is to provide a fast and efficient procedure for (real-time) target identification in imaging based on matching against a dictionary of precomputed generalized polarization tensors (GPTs). The approach is based on some important properties of the GPTs and new invariants. A new shape representation is given and numerically tested in the presence of measurement noise. The stability and resolution of the proposed identification algorithm are numerically quantified.
NA · Dec 14, 2012
Tracking of a Mobile Target Using Generalized Polarization Tensors
Habib Ammari, Thomas Boulier, Josselin Garnier et al.
In this paper we apply an extended Kalman filter to track both the location and the orientation of a mobile target from multistatic response measurements. We also analyze the effect of the limited-view aspect on the stability and the efficiency of our tracking approach. Our algorithm is based on the use of the generalized polarization tensors, which can be reconstructed from the multistatic response measurements by solving a linear system. The system has the remarkable property that low order generalized polarization tensors are not affected by the error caused by the instability of higher orders in the presence of measurement noise.
ML · May 29, 2022
Comparison of meta-learners for estimating multi-valued treatment heterogeneous effects
Naoufal Acharki, Ramiro Lugo, Antoine Bertoncello et al.
Conditional Average Treatment Effects (CATE) estimation is one of the main challenges in causal inference with observational data. In addition to machine-learning-based models, nonparametric estimators called meta-learners have been developed to estimate the CATE with the main advantage of not restricting the estimation to a specific supervised learning method. This task becomes, however, more complicated when the treatment is not binary, as some limitations of the naive extensions emerge. This paper looks into meta-learners for estimating the heterogeneous effects of multi-valued treatments. We consider different meta-learners, and we carry out a theoretical analysis of their error upper bounds as functions of important parameters such as the number of treatment levels, showing that the naive extensions do not always provide satisfactory results. We introduce and discuss meta-learners that perform well as the number of treatments increases. We empirically confirm the strengths and weaknesses of those methods with synthetic and semi-synthetic datasets.
PR · Jun 6, 2012
Anomalous Shock Displacement Probabilities for a Perturbed Scalar Conservation Law
Josselin Garnier, George Papanicolaou, Tzu-Wei Yang
We consider a one-dimensional conservation law with random space-time forcing and calculate, using large deviations, the exponentially small probabilities of anomalous shock profile displacements. Under suitable hypotheses on the spatial support and structure of random forces, we analyze the scaling behavior of the rate function, which is the exponential decay rate of the displacement probabilities. For small displacements we show that the rate function is bounded above and below by the square of the displacement divided by time. For large displacements the corresponding bounds for the rate function are proportional to the displacement. We calculate numerically the rate function under different conditions and show that the theoretical analysis of scaling behavior is confirmed. We also apply a large-deviation-based importance sampling Monte Carlo strategy to estimate the displacement probabilities. We use a biased distribution centered on the forcing that gives the most probable transition path for the anomalous shock profile, which is the minimizer of the rate function. The numerical simulations indicate that this strategy is much more effective and robust than basic Monte Carlo.
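The idea of centering the biased distribution on the minimizer of the rate function can be shown on a much simpler problem than the conservation law studied in the paper. The sketch below is an illustrative scalar analogue under stated assumptions, not the authors' algorithm: it estimates the Gaussian tail probability P(Z > a) for a standard normal Z, whose rate function x²/2 is minimized over the rare set {x > a} at x = a, so the proposal is N(a, 1). The function names and the choice a = 4 are hypothetical.

```python
import math
import random


def tail_probability_exact(a):
    """P(Z > a) for Z ~ N(0, 1), via the complementary error function."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))


def tail_probability_mc(a, n, rng):
    """Basic Monte Carlo: fraction of standard normal draws exceeding a."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > a)
    return hits / n


def tail_probability_is(a, n, rng):
    """Importance sampling with proposal N(a, 1), centered on the most probable rare point."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(a, 1.0)
        if x > a:
            # Likelihood ratio phi(x) / phi(x - a) = exp(-a*x + a^2/2).
            total += math.exp(-a * x + 0.5 * a * a)
    return total / n


a, n = 4.0, 100_000
exact = tail_probability_exact(a)
basic = tail_probability_mc(a, n, random.Random(0))
biased = tail_probability_is(a, n, random.Random(0))
print(f"exact={exact:.3e}  basic MC={basic:.3e}  importance sampling={biased:.3e}")
```

With these settings basic Monte Carlo sees only a handful of hits on average, while about half of the shifted draws land in the rare set and are reweighted by the likelihood ratio. That difference is the mechanism behind the efficiency gain the abstract reports.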
AP · Feb 22, 2016
Robust imaging with electromagnetic waves in noisy environments
Liliana Borcea, Josselin Garnier
We study imaging with an array of sensors that probes a medium with single-frequency electromagnetic waves and records the scattered electric field. The medium is known and homogeneous except for some small and penetrable inclusions. The goal of inversion is to locate and characterize these inclusions from the data collected by the array, which are corrupted by additive noise. We use results from random matrix theory to obtain a robust inversion method. We assess its performance with numerical simulations and quantify the benefit of measuring more than one component of the scattered electric field.
ML · May 15
Dimension-Uniform Discretization Analysis of Preconditioned Annealed Langevin Dynamics for Multimodal Gaussian Mixtures
Lorenzo Baldassari, Josselin Garnier, Knut Solna et al.
Obtaining stable diffusion-based samplers in high- and infinite-dimensional settings is challenging because errors can accumulate across high-frequency coordinates and make the dynamics unstable under refinement of the finite-dimensional approximation of the underlying function-space problem. Discretization is a typical source of such errors, and preconditioning with a suitable spectral decay is one way to control their accumulation. In this paper, we study this problem for preconditioned annealed Langevin dynamics (ALD) applied to Gaussian mixtures. We first show that Euler-Maruyama (EM) discretization, by treating the stiff linear part of the annealed score with a forward Euler step, imposes a stability constraint coupling the preconditioner with the annealed covariance scale. Together with the conditions ensuring dimension-uniform control of the annealed dynamics, this constraint forces the initial smoothed law to remain uniformly close to the target across dimensions. We then consider an exponential-integrator scheme that integrates the stiff linear part of the annealed score exactly. Under explicit spectral summability conditions coupling the smoothing covariance, the component covariance spectra, and the preconditioner, we prove a dimension-uniform Kullback-Leibler (KL) bound for this scheme. This bound can be made arbitrarily small, uniformly in dimension, by allowing enough time for annealing and then refining the time mesh accordingly. Importantly, these conditions allow regimes in which the KL divergence between the target and the initial smoothed law diverges with dimension, showing that the restrictions imposed by EM are scheme-dependent rather than intrinsic to ALD.
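The contrast between a forward Euler step and exact integration of a stiff linear drift can be seen on a single coordinate. The sketch below is a hedged illustration, not the scheme analyzed in the paper: it considers the scalar Ornstein-Uhlenbeck equation dX = -a X dt + sqrt(2) dW, where the stiffness a is an assumed stand-in for the ratio of a preconditioner eigenvalue to a covariance eigenvalue on one high-frequency coordinate. All function names are hypothetical.

```python
import math


def em_step_factor(a, h):
    """Factor multiplying X in one Euler-Maruyama step of dX = -a X dt + sqrt(2) dW."""
    return 1.0 - a * h


def exp_step_factor(a, h):
    """Factor multiplying X when the linear drift is integrated exactly over a step h."""
    return math.exp(-a * h)


def em_stationary_variance(a, h):
    """Stationary variance of the EM chain; finite only when 0 < a*h < 2."""
    r = em_step_factor(a, h)
    if abs(r) >= 1.0:
        return math.inf                      # the recursion is unstable
    return 2.0 * h / (1.0 - r * r)           # noise variance per step is 2h


def exp_stationary_variance(a, h):
    """Stationary variance of the exponential-integrator chain (equals 1/a for any h)."""
    r = exp_step_factor(a, h)
    noise_var = (1.0 - r * r) / a            # exact variance of the OU increment over h
    return noise_var / (1.0 - r * r)


h = 0.01
for a in (1.0, 100.0, 1000.0):
    print(
        f"a={a:7.1f}  exact var={1.0 / a:.4e}  "
        f"EM var={em_stationary_variance(a, h):.4e}  "
        f"exp-integrator var={exp_stationary_variance(a, h):.4e}"
    )
```

For a fixed step h, the Euler-Maruyama chain is stable only when a·h < 2, so stiff coordinates force either a small step or a restriction on a. The exponential step contracts for every a > 0 and reproduces the stationary variance 1/a exactly in this linear toy case. The Gaussian-mixture score in the paper also has a nonlinear part that this sketch does not model.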
ML · May 13
On Hallucinations in Inverse Problems: Fundamental Limits and Provable Assessment Methods
David Iagaru, Nina M. Gottschling, Anders C. Hansen et al.
Artificial intelligence (AI) has transformed imaging inverse problems, from medical diagnostics to Earth observation. Yet deep neural networks can produce hallucinations, realistic-looking but incorrect details, undermining their reliability, especially when ground truth data is unavailable. We develop a theoretical framework showing that such hallucinations are not merely artifacts of particular models, but can arise from the ill-posed nature of the inverse problem itself. We derive necessary and sufficient conditions for hallucinations, together with computable bounds on their magnitude that depend only on the forward model. Building on this theory, we introduce algorithms to: (1) estimate the minimum hallucination magnitude achievable by any reconstruction model for a given input; (2) assess the faithfulness of reconstructed details by a given reconstruction model. Experiments across three imaging tasks demonstrate that our approach applies broadly, including to modern generative models, and provides a principled way to quantify and evaluate AI hallucinations.
ML · May 11
Multifidelity Gaussian process regression for solving nonlinear partial differential equations
Fatima-Zahrae El-Boukkouri, Josselin Garnier, Olivier Roustant
Solving nonlinear partial differential equations (PDEs) using kernel methods offers a compelling alternative to traditional numerical solvers. However, the performance of these methods strongly depends on the choice of kernel. In this work, as the available information is inherently multifidelity, we propose a kernel learning approach based on cokriging, leveraging empirical information from multifidelity simulations. In the first step, we fit a differentiable non-stationary kernel to an empirical kernel obtained from low-fidelity simulations. In the second step, we derive a high-fidelity kernel with estimated hyperparameters, and construct a corresponding high-fidelity mean using the multifidelity framework. These components can then be used within a Gaussian process framework for solving PDEs. Finally, we demonstrate the performance of the proposed physics-informed method on the Burgers' equation.
ML · Feb 6, 2024
Gaussian process regression with Sliced Wasserstein Weisfeiler-Lehman graph kernels
Raphaël Carpintero Perez, Sébastien da Veiga, Josselin Garnier et al.
Supervised learning has recently garnered significant attention in the field of computational physics due to its ability to effectively extract complex patterns for tasks like solving partial differential equations, or predicting material properties. Traditionally, such datasets consist of inputs given as meshes with a large number of nodes representing the problem geometry (seen as graphs), and corresponding outputs obtained with a numerical solver. This means the supervised learning model must be able to handle large and sparse graphs with continuous node attributes. In this work, we focus on Gaussian process regression, for which we introduce the Sliced Wasserstein Weisfeiler-Lehman (SWWL) graph kernel. In contrast to existing graph kernels, the proposed SWWL kernel enjoys positive definiteness and a drastic complexity reduction, which makes it possible to process datasets that were previously impossible to handle. The new kernel is first validated on graph classification for molecular datasets, where the input graphs have a few tens of nodes. The efficiency of the SWWL kernel is then illustrated on graph regression in computational fluid dynamics and solid mechanics, where the input graphs are made up of tens of thousands of nodes.
ML · May 24, 2024
Taming Score-Based Diffusion Priors for Infinite-Dimensional Nonlinear Inverse Problems
Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier et al.
This work introduces a sampling method capable of solving Bayesian inverse problems in function space. It does not assume the log-concavity of the likelihood, meaning that it is compatible with nonlinear inverse problems. The method leverages the recently defined infinite-dimensional score-based diffusion models as a learning-based prior, while enabling provable posterior sampling through a Langevin-type MCMC algorithm defined on function spaces. A novel convergence analysis is conducted, inspired by the fixed-point methods established for traditional regularization-by-denoising algorithms and compatible with weighted annealing. The obtained convergence bound explicitly depends on the approximation error of the score; a well-approximated score is essential to obtain a well-approximated posterior. Stylized and PDE-based examples are provided, demonstrating the validity of our convergence analysis. We conclude by presenting a discussion of the method's challenges related to learning the score and computational complexity.
ML · Oct 21, 2024
Learning signals defined on graphs with optimal transport and Gaussian process regression
Raphaël Carpintero Perez, Sébastien da Veiga, Josselin Garnier et al.
In computational physics, machine learning has now emerged as a powerful complementary tool to efficiently explore candidate designs in engineering studies. Outputs in such supervised problems are signals defined on meshes, and a natural question is the extension of general scalar output regression models to such complex outputs. Changes between input geometries in terms of both size and adjacency structure in particular make this transition non-trivial. In this work, we propose an innovative strategy for Gaussian process regression where inputs are large and sparse graphs with continuous node attributes and outputs are signals defined on the nodes of the associated inputs. The methodology relies on the combination of regularized optimal transport, dimension reduction techniques, and the use of Gaussian processes indexed by graphs. In addition to enabling signal prediction, a key feature of our proposal is that it provides confidence intervals on node values, which is crucial for uncertainty quantification and active learning. Numerical experiments highlight the efficiency of the method in solving real problems in fluid dynamics and solid mechanics.
ML · Oct 2, 2025
A reproducible comparative study of categorical kernels for Gaussian process regression, with new clustering-based nested kernels
Raphaël Carpintero Perez, Sébastien Da Veiga, Josselin Garnier
Designing categorical kernels is a major challenge for Gaussian process regression with continuous and categorical inputs. Despite previous studies, it is difficult to identify a preferred method, because the evaluation metrics, the optimization procedure, or the datasets change depending on the study. In particular, reproducible code is rarely available. The aim of this paper is to provide a reproducible comparative study of all existing categorical kernels on many of the test cases investigated so far. We also propose new evaluation metrics inspired by the optimization community, which provide quantitative rankings of the methods across several tasks. From our results on datasets which exhibit a group structure on the levels of categorical inputs, it appears that nested kernel methods clearly outperform all competitors. When the group structure is unknown or when there is no prior knowledge of such a structure, we propose a new clustering-based strategy using target encodings of categorical variables. We show that on a large panel of datasets, which do not necessarily have a known group structure, this estimation strategy still outperforms other approaches while maintaining low computational cost.
ML · May 23, 2025
Preconditioned Langevin Dynamics with Score-Based Generative Models for Infinite-Dimensional Linear Bayesian Inverse Problems
Lorenzo Baldassari, Josselin Garnier, Knut Solna et al.
Designing algorithms for solving high-dimensional Bayesian inverse problems directly in infinite-dimensional function spaces - where such problems are naturally formulated - is crucial to ensure stability and convergence as the discretization of the underlying problem is refined. In this paper, we contribute to this line of work by analyzing a widely used sampler for linear inverse problems: Langevin dynamics driven by score-based generative models (SGMs) acting as priors, formulated directly in function space. Building on the theoretical framework for SGMs in Hilbert spaces, we give a rigorous definition of this sampler in the infinite-dimensional setting and derive, for the first time, error estimates that explicitly depend on the approximation error of the score. As a consequence, we obtain sufficient conditions for global convergence in Kullback-Leibler divergence on the underlying function space. Preventing numerical instabilities requires preconditioning of the Langevin algorithm and we prove the existence and the form of an optimal preconditioner. The preconditioner depends on both the score error and the forward operator and guarantees a uniform convergence rate across all posterior modes. Our analysis applies to both Gaussian and a general class of non-Gaussian priors. Finally, we present examples that illustrate and validate our theoretical findings.
ML · May 28, 2023
Conditional score-based diffusion models for Bayesian inference in infinite dimensions
Lorenzo Baldassari, Ali Siahkoohi, Josselin Garnier et al.
Since their initial introduction, score-based diffusion models (SDMs) have been successfully applied to solve a variety of linear inverse problems in finite-dimensional vector spaces due to their ability to efficiently approximate the posterior distribution. However, using SDMs for inverse problems in infinite-dimensional function spaces has only been addressed recently, primarily through methods that learn the unconditional score. While this approach is advantageous for some inverse problems, it is mostly heuristic and involves numerous computationally costly forward operator evaluations during posterior sampling. To address these limitations, we propose a theoretically grounded method for sampling from the posterior of infinite-dimensional Bayesian linear inverse problems based on amortized conditional SDMs. In particular, we prove that one of the most successful approaches for estimating the conditional score in finite dimensions - the conditional denoising estimator - can also be applied in infinite dimensions. A significant part of our analysis is dedicated to demonstrating that extending infinite-dimensional SDMs to the conditional setting requires careful consideration, as the conditional score, unlike the unconditional score, typically blows up for small times. We conclude by presenting stylized and large-scale numerical examples that validate our approach, offer additional insights, and demonstrate that our method enables large-scale, discretization-invariant Bayesian inference.
ML · Sep 9, 2021
Importance sampling based active learning for parametric seismic fragility curve estimation
Clement Gauchy, Cyril Feau, Josselin Garnier
The key elements of seismic probabilistic risk assessment studies are the fragility curves, which express the probabilities of failure of structures conditional on a seismic intensity measure. A multitude of procedures is currently available to estimate these curves. For modeling-based approaches, which may involve complex and expensive numerical models, the main challenge is to optimize the calls to the numerical codes to reduce the estimation costs. Adaptive techniques can be used for this purpose, but in doing so, taking into account the uncertainties of the estimates (via confidence intervals or ellipsoids related to the size of the samples used) is an arduous task because the samples are no longer independent and possibly not identically distributed. The main contribution of this work is to deal with this question in a mathematically rigorous way. To this end, we propose and implement an active learning methodology based on adaptive importance sampling for parametric estimations of fragility curves. We prove some theoretical properties (consistency and asymptotic normality) for the estimator of interest. Moreover, we give a convergence criterion in order to use asymptotic confidence ellipsoids. Finally, the performance of the methodology is evaluated on analytical and industrial test cases of increasing complexity.
LG · Sep 25, 2018
Efficient Seismic fragility curve estimation by Active Learning on Support Vector Machines
Rémi Sainct, Cyril Feau, Jean-Marc Martinez et al.
Fragility curves, which express the failure probability of a structure, or of critical components, as a function of a loading intensity measure, are nowadays widely used (i) in Seismic Probabilistic Risk Assessment studies and (ii) to evaluate the impact of construction details on the structural performance of installations under seismic excitations or under other loading sources such as wind. To avoid the use of parametric models such as the lognormal model to estimate fragility curves from a reduced number of numerical calculations, a methodology based on Support Vector Machines (SVMs) coupled with an active learning algorithm is proposed in this paper. In practice, input excitation is reduced to some relevant parameters and, given these parameters, SVMs are used for a binary classification of the structural responses relative to a limit threshold of exceedance. Since the SVM output is not merely binary but a score, a probabilistic interpretation of this output is exploited to estimate fragility curves very efficiently, as score functions or as functions of classical seismic intensity measures.