Marc Schoenauer

LG
h-index34
55papers
1,039citations
Novelty43%
AI Score47

55 Papers

MLSep 11, 2023
Interpretable learning of effective dynamics for multiscale systems

Emmanuel Menier, Sebastian Kaltenbach, Mouadh Yagoubi et al.

The modeling and simulation of high-dimensional multiscale systems is a critical challenge across all areas of science and engineering. It is broadly believed that even with today's computer advances resolving all spatiotemporal scales described by the governing equations remains a remote target. This realization has prompted intense efforts to develop model order reduction techniques. In recent years, techniques based on deep recurrent neural networks have produced promising results for the modeling and simulation of complex spatiotemporal systems and offer large flexibility in model development as they can incorporate experimental and computational data. However, neural networks lack interpretability, which limits their utility and generalizability across complex systems. Here we propose a novel framework of Interpretable Learning Effective Dynamics (iLED) that offers comparable accuracy to state-of-the-art recurrent neural network-based approaches while providing the added benefit of interpretability. The iLED framework is motivated by Mori-Zwanzig and Koopman operator theory, which justifies the choice of the specific architecture. We demonstrate the effectiveness of the proposed framework in simulations of three benchmark multiscale systems. Our results show that the iLED framework can generate accurate predictions and obtain interpretable dynamics, making it a promising approach for solving high-dimensional multiscale systems.

NAMay 31, 2010
Experimental Comparisons of Derivative Free Optimization Algorithms

Anne Auger, Nikolaus Hansen, Jorge M. Perez Zerpa et al.

In this paper, the performances of the quasi-Newton BFGS algorithm, the NEWUOA derivative free optimizer, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), the Differential Evolution (DE) algorithm and Particle Swarm Optimizers (PSO) are compared experimentally on benchmark functions reflecting important challenges encountered in real-world optimization problems. Dependence of the performances in the conditioning of the problem and rotational invariance of the algorithms are in particular investigated.

AIDec 17, 2012
Online Learning for Ground Trajectory Prediction

Areski Hadjaz, Gaétan Marceau, Pierre Savéant et al.

This paper presents a model based on an hybrid system to numerically simulate the climbing phase of an aircraft. This model is then used within a trajectory prediction tool. Finally, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) optimization algorithm is used to tune five selected parameters, and thus improve the accuracy of the model. Incorporated within a trajectory prediction tool, this model can be used to derive the order of magnitude of the prediction error over time, and thus the domain of validity of the trajectory prediction. A first validation experiment of the proposed model is based on the errors along time for a one-time trajectory prediction at the take off of the flight with respect to the default values of the theoretical BADA model. This experiment, assuming complete information, also shows the limit of the model. A second experiment part presents an on-line trajectory prediction, in which the prediction is continuously updated based on the current aircraft position. This approach raises several issues, for which improvements of the basic model are proposed, and the resulting trajectory prediction tool shows statistically significantly more accurate results than those of the default model.

NAJun 11, 2007
Evolutionary Mesh Numbering: Preliminary Results

Francis Sourd, Marc Schoenauer

Mesh numbering is a critical issue in Finite Element Methods, as the computational cost of one analysis is highly dependent on the order of the nodes of the mesh. This paper presents some preliminary investigations on the problem of mesh numbering using Evolutionary Algorithms. Three conclusions can be drawn from these experiments. First, the results of the up-to-date method used in all FEM softwares (Gibb's method) can be consistently improved; second, none of the crossover operators tried so far (either general or problem specific) proved useful; third, though the general tendency in Evolutionary Computation seems to be the hybridization with other methods (deterministic or heuristic), none of the presented attempt did encounter any success yet. The good news, however, is that this algorithm allows an improvement over the standard heuristic method between 12% and 20% for both the 1545 and 5453-nodes meshes used as test-bed. Finally, some strange interaction between the selection scheme and the use of problem specific mutation operator was observed, which appeals for further investigation.

OCOct 1, 2007
Identification of the Isotherm Function in Chromatography Using CMA-ES

Mohamed Jebalia, Anne Auger, Marc Schoenauer et al.

This paper deals with the identification of the flux for a system of conservation laws in the specific example of analytic chromatography. The fundamental equations of chromatographic process are highly non linear. The state-of-the-art Evolution Strategy, CMA-ES (the Covariance Matrix Adaptation Evolution Strategy), is used to identify the parameters of the so-called isotherm function. The approach was validated on different configurations of simulated data using either one, two or three components mixtures. CMA-ES is then applied to real data cases and its results are compared to those of a gradient-based strategy.

AIDec 17, 2012
Increasing Air Traffic: What is the Problem?

Areski Hadjaz, Gaétan Marceau, Pierre Savéant et al.

Nowadays, huge efforts are made to modernize the air traffic management systems to cope with uncertainty, complexity and sub-optimality. An answer is to enhance the information sharing between the stakeholders. This paper introduces a framework that bridges the gap between air traffic management and air traffic control on the one hand, and bridges the gap between the ground, the approach and the en-route centers on the other hand. An original system is presented, that has three essential components: the trajectory models, the optimization process, and the monitoring process. The uncertainty of the trajectory is modeled with a Bayesian Network, where the nodes are associated to two types of random variables: the time of overflight on metering points of the airspace, and the traveling time of the routes linking these points. The resulting Bayesian Network covers the complete airspace, and Monte- Carlo simulations are done to estimate the probabilities of sector congestion and delays. On top of this trajectory model, an optimization process minimizes these probabilities by tuning the parameters of the Bayesian trajectory model related to overflight times on metering points. The last component is the monitoring process, that continuously updates the situation of the airspace, modifying the trajectories uncertainties according to actual positions of aircraft. After each update, a new optimal set of overflight times is computed, and can be communicated to the controllers as clearances for the aircraft pilots. The paper presents a formal specification of this global optimization problem, whose underlying rationale was derived with the help of air traffic controllers at Thales Air Systems.

AIApr 28, 2023
MultiZenoTravel: a Tunable Benchmark for Multi-Objective Planning with Known Pareto Front

Alexandre Quemy, Marc Schoenauer, Johann Dreo

Multi-objective AI planning suffers from a lack of benchmarks exhibiting known Pareto Fronts. In this work, we propose a tunable benchmark generator, together with a dedicated solver that provably computes the true Pareto front of the resulting instances. First, we prove a proposition allowing us to characterize the optimal plans for a constrained version of the problem, and then show how to reduce the general problem to the constrained one. Second, we provide a constructive way to find all the Pareto-optimal plans and discuss the complexity of the algorithm. We provide an implementation that allows the solver to handle realistic instances in a reasonable time. Finally, as a practical demonstration, we used this solver to find all Pareto-optimal plans between the two largest airports in the world, considering the routes between the 50 largest airports, spherical distances between airports and a made-up risk.

LGFeb 6, 2023
An Implicit GNN Solver for Poisson-like problems

Matthieu Nastorg, Michele Alessandro Bucci, Thibault Faney et al.

This paper presents $Ψ$-GNN, a novel Graph Neural Network (GNN) approach for solving the ubiquitous Poisson PDE problems with mixed boundary conditions. By leveraging the Implicit Layer Theory, $Ψ$-GNN models an "infinitely" deep network, thus avoiding the empirical tuning of the number of required Message Passing layers to attain the solution. Its original architecture explicitly takes into account the boundary conditions, a critical prerequisite for physical applications, and is able to adapt to any initially provided solution. $Ψ$-GNN is trained using a "physics-informed" loss, and the training process is stable by design, and insensitive to its initialization. Furthermore, the consistency of the approach is theoretically proven, and its flexibility and generalization efficiency are experimentally demonstrated: the same learned model can accurately handle unstructured meshes of various sizes, as well as different boundary conditions. To the best of our knowledge, $Ψ$-GNN is the first physics-informed GNN-based method that can handle various unstructured domains, boundary conditions and initial solutions while also providing convergence guarantees.

LGNov 21, 2022
DS-GPS : A Deep Statistical Graph Poisson Solver (for faster CFD simulations)

Matthieu Nastorg, Marc Schoenauer, Guillaume Charpiat et al.

This paper proposes a novel Machine Learning-based approach to solve a Poisson problem with mixed boundary conditions. Leveraging Graph Neural Networks, we develop a model able to process unstructured grids with the advantage of enforcing boundary conditions by design. By directly minimizing the residual of the Poisson equation, the model attempts to learn the physics of the problem without the need for exact solutions, in contrast to most previous data-driven processes where the distance with the available solutions is minimized.

CVJul 8, 2022
Continuous Methods : Hamiltonian Domain Translation

Emmanuel Menier, Michele Alessandro Bucci, Mouadh Yagoubi et al.

This paper proposes a novel approach to domain translation. Leveraging established parallels between generative models and dynamical systems, we propose a reformulation of the Cycle-GAN architecture. By embedding our model with a Hamiltonian structure, we obtain a continuous, expressive and most importantly invertible generative model for domain translation.

LGNov 30, 2022
Continuous Methods : Adaptively intrusive reduced order model closure

Emmanuel Menier, Michele Alessandro Bucci, Mouadh Yagoubi et al.

Reduced order modeling methods are often used as a mean to reduce simulation costs in industrial applications. Despite their computational advantages, reduced order models (ROMs) often fail to accurately reproduce complex dynamics encountered in real life applications. To address this challenge, we leverage NeuralODEs to propose a novel ROM correction approach based on a time-continuous memory formulation. Finally, experimental results show that our proposed method provides a high level of accuracy while retaining the low computational costs inherent to reduced models.

LGJun 18, 2023
Meta-Learning for Airflow Simulations with Graph Neural Networks

Wenzhuo Liu, Mouadh Yagoubi, Marc Schoenauer

The field of numerical simulation is of significant importance for the design and management of real-world systems, with partial differential equations (PDEs) being a commonly used mathematical modeling tool. However, solving PDEs remains still a challenge, as commonly used traditional numerical solvers often require high computational costs. As a result, data-driven methods leveraging machine learning (more particularly Deep Learning) algorithms have been increasingly proposed to learn models that can predict solutions to complex PDEs, such as those arising in computational fluid dynamics (CFD). However, these methods are known to suffer from poor generalization performance on out-of-distribution (OoD) samples, highlighting the need for more efficient approaches. To this end, we present a meta-learning approach to enhance the performance of learned models on OoD samples. Specifically, we set the airflow simulation in CFD over various airfoils as a meta-learning problem, where each set of examples defined on a single airfoil shape is treated as a separate task. Through the use of model-agnostic meta-learning (MAML), we learn a meta-learner capable of adapting to new tasks, i.e., previously unseen airfoil shapes, using only a small amount of task-specific data. We experimentally demonstrate the efficiency of the proposed approach for improving the OoD generalization performance of learned models while maintaining efficiency.

MLMay 8, 2024
Learning Structural Causal Models through Deep Generative Models: Methods, Guarantees, and Challenges

Audrey Poinsot, Alessandro Leite, Nicolas Chesneau et al.

This paper provides a comprehensive review of deep structural causal models (DSCMs), particularly focusing on their ability to answer counterfactual queries using observational data within known causal structures. It delves into the characteristics of DSCMs by analyzing the hypotheses, guarantees, and applications inherent to the underlying deep learning components and structural causal models, fostering a finer understanding of their capabilities and limitations in addressing different counterfactual queries. Furthermore, it highlights the challenges and open questions in the field of deep structural causal modeling. It sets the stages for researchers to identify future work directions and for practitioners to get an overview in order to find out the most appropriate methods for their needs.

CVNov 27, 2024
Mixture of Experts in Image Classification: What's the Sweet Spot?

Mathurin Videau, Alessandro Leite, Marc Schoenauer et al.

Mixture-of-Experts (MoE) models have shown promising potential for parameter-efficient scaling across domains. However, their application to image classification remains limited, often requiring billion-scale datasets to be competitive. In this work, we explore the integration of MoE layers into image classification architectures using open datasets. We conduct a systematic analysis across different MoE configurations and model scales. We find that moderate parameter activation per sample provides the best trade-off between performance and efficiency. However, as the number of activated parameters increases, the benefits of MoE diminish. Our analysis yields several practical insights for vision MoE design. First, MoE layers most effectively strengthen tiny and mid-sized models, while gains taper off for large-capacity networks and do not redefine state-of-the-art ImageNet performance. Second, a Last-2 placement heuristic offers the most robust cross-architecture choice, with Every-2 slightly better for Vision Transform (ViT), and both remaining effective as data and model scale increase. Third, larger datasets (e.g., ImageNet-21k) allow more experts, up to 16, for ConvNeXt to be utilized effectively without changing placement, as increased data reduces overfitting and promotes broader expert specialization. Finally, a simple linear router performs best, suggesting that additional routing complexity yields no consistent benefit.

CLJun 17, 2025
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

Mathurin Videau, Badr Youbi Idrissi, Alessandro Leite et al. · meta-ai

Tokenization imposes a fixed granularity on the input text, freezing how a language model operates on data and how far in the future it predicts. Byte Pair Encoding (BPE) and similar schemes split text once, build a static vocabulary, and leave the model stuck with that choice. We relax this rigidity by introducing an autoregressive U-Net that learns to embed its own tokens as it trains. The network reads raw bytes, pools them into words, then pairs of words, then up to 4 words, giving it a multi-scale view of the sequence. At deeper stages, the model must predict further into the future -- anticipating the next few words rather than the next byte -- so deeper stages focus on broader semantic patterns while earlier stages handle fine details. When carefully tuning and controlling pretraining compute, shallow hierarchies tie strong BPE baselines, and deeper hierarchies have a promising trend. Because tokenization now lives inside the model, the same system can handle character-level tasks and carry knowledge across low-resource languages.

LGMar 3, 2024
ML4PhySim : Machine Learning for Physical Simulations Challenge (The airfoil design)

Mouadh Yagoubi, Milad Leyli-Abadi, David Danan et al.

The use of machine learning (ML) techniques to solve complex physical problems has been considered recently as a promising approach. However, the evaluation of such learned physical models remains an important issue for industrial use. The aim of this competition is to encourage the development of new ML techniques to solve physical problems using a unified evaluation framework proposed recently, called Learning Industrial Physical Simulations (LIPS). We propose learning a task representing a well-known physical use case: the airfoil design simulation, using a dataset called AirfRANS. The global score calculated for each submitted solution is based on three main categories of criteria covering different aspects, namely: ML-related, Out-Of-Distribution, and physical compliance criteria. To the best of our knowledge, this is the first competition addressing the use of ML-based surrogate approaches to improve the trade-off computational cost/accuracy of physical simulation.The competition is hosted by the Codabench platform with online training and evaluation of all submitted solutions.

CLDec 5, 2024
Evolutionary Pre-Prompt Optimization for Mathematical Reasoning

Mathurin Videau, Alessandro Leite, Marc Schoenauer et al.

Recent advancements have highlighted that large language models (LLMs), when given a small set of task-specific examples, demonstrate remarkable proficiency, a capability that extends to complex reasoning tasks. In particular, the combination of few-shot learning with the chain-of-thought (CoT) approach has been pivotal in steering models towards more logically consistent conclusions. This paper explores the optimization of example selection for designing effective CoT pre-prompts and shows that the choice of the optimization algorithm, typically in favor of comparison-based methods such as evolutionary computation, significantly enhances efficacy and feasibility. Specifically, thanks to a limited exploitative and overfitted optimization, Evolutionary Pre-Prompt Optimization (EPPO) brings an improvement over the naive few-shot approach exceeding 10 absolute points in exact match scores on benchmark datasets such as GSM8k and MathQA. These gains are consistent across various contexts and are further amplified when integrated with self-consistency (SC)

LGOct 15, 2024
Evolutionary Retrofitting

Mathurin Videau, Mariia Zameshina, Alessandro Leite et al.

AfterLearnER (After Learning Evolutionary Retrofitting) consists in applying evolutionary optimization to refine fully trained machine learning models by optimizing a set of carefully chosen parameters or hyperparameters of the model, with respect to some actual, exact, and hence possibly non-differentiable error signal, performed on a subset of the standard validation set. The efficiency of AfterLearnER is demonstrated by tackling non-differentiable signals such as threshold-based criteria in depth sensing, the word error rate in speech re-synthesis, the number of kills per life at Doom, computational accuracy or BLEU in code translation, image quality in 3D generative adversarial networks (GANs), and user feedback in image generation via Latent Diffusion Models (LDM). This retrofitting can be done after training, or dynamically at inference time by taking into account the user feedback. The advantages of AfterLearnER are its versatility, the possibility to use non-differentiable feedback, including human evaluations (i.e., no gradient is needed), the limited overfitting supported by a theoretical study, and its anytime behavior. Last but not least, AfterLearnER requires only a small amount of feedback, i.e., a few dozen to a few hundred scalars, compared to the tens of thousands needed in most related published works.

LGFeb 13, 2024
Multi-Level GNN Preconditioner for Solving Large Scale Problems

Matthieu Nastorg, Jean-Marc Gratien, Thibault Faney et al.

Large-scale numerical simulations often come at the expense of daunting computations. High-Performance Computing has enhanced the process, but adapting legacy codes to leverage parallel GPU computations remains challenging. Meanwhile, Machine Learning models can harness GPU computations effectively but often struggle with generalization and accuracy. Graph Neural Networks (GNNs), in particular, are great for learning from unstructured data like meshes but are often limited to small-scale problems. Moreover, the capabilities of the trained model usually restrict the accuracy of the data-driven solution. To benefit from both worlds, this paper introduces a novel preconditioner integrating a GNN model within a multi-level Domain Decomposition framework. The proposed GNN-based preconditioner is used to enhance the efficiency of a Krylov method, resulting in a hybrid solver that can converge with any desired level of accuracy. The efficiency of the Krylov method greatly benefits from the GNN preconditioner, which is adaptable to meshes of any size and shape, is executed on GPUs, and features a multi-level approach to enforce the scalability of the entire process. Several experiments are conducted to validate the numerical behavior of the hybrid solver, and an in-depth analysis of its performance is proposed to assess its competitiveness against a C++ legacy solver.

LGNov 28, 2025
CausalProfiler: Generating Synthetic Benchmarks for Rigorous and Transparent Evaluation of Causal Machine Learning

Panayiotis Panayiotou, Audrey Poinsot, Alessandro Leite et al.

Causal machine learning (Causal ML) aims to answer "what if" questions using machine learning algorithms, making it a promising tool for high-stakes decision-making. Yet, empirical evaluation practices in Causal ML remain limited. Existing benchmarks often rely on a handful of hand-crafted or semi-synthetic datasets, leading to brittle, non-generalizable conclusions. To bridge this gap, we introduce CausalProfiler, a synthetic benchmark generator for Causal ML methods. Based on a set of explicit design choices about the class of causal models, queries, and data considered, the CausalProfiler randomly samples causal models, data, queries, and ground truths constituting the synthetic causal benchmarks. In this way, Causal ML methods can be rigorously and transparently evaluated under a variety of conditions. This work offers the first random generator of synthetic causal benchmarks with coverage guarantees and transparent assumptions operating on the three levels of causal reasoning: observation, intervention, and counterfactual. We demonstrate its utility by evaluating several state-of-the-art methods under diverse conditions and assumptions, both in and out of the identification regime, illustrating the types of analyses and insights the CausalProfiler enables.

LGAug 12, 2025
Position: Causal Machine Learning Requires Rigorous Synthetic Experiments for Broader Adoption

Audrey Poinsot, Panayiotis Panayiotou, Alessandro Leite et al.

Causal machine learning has the potential to revolutionize decision-making by combining the predictive power of machine learning algorithms with the theory of causal inference. However, these methods remain underutilized by the broader machine learning community, in part because current empirical evaluations do not permit assessment of their reliability and robustness, undermining their practical utility. Specifically, one of the principal criticisms made by the community is the extensive use of synthetic experiments. We argue, on the contrary, that synthetic experiments are essential and necessary to precisely assess and understand the capabilities of causal machine learning methods. To substantiate our position, we critically review the current evaluation practices, spotlight their shortcomings, and propose a set of principles for conducting rigorous empirical analyses with synthetic data. Adopting the proposed principles will enable comprehensive evaluations that build trust in causal machine learning methods, driving their broader adoption and impactful real-world use.

LGJul 2, 2025
Tuning without Peeking: Provable Privacy and Generalization Bounds for LLM Post-Training

Ismail Labiad, Mathurin Videau, Matthieu Kowalski et al.

Gradient-based optimization is the workhorse of deep learning, offering efficient and scalable training via backpropagation. However, exposing gradients during training can leak sensitive information about the underlying data, raising privacy and security concerns such as susceptibility to data poisoning attacks. In contrast, black box optimization methods, which treat the model as an opaque function, relying solely on function evaluations to guide optimization, offer a promising alternative in scenarios where data access is restricted, adversarial risks are high, or overfitting is a concern. This paper introduces BBoxER, an evolutionary black-box method for LLM post-training that induces an information bottleneck via implicit compression of the training data. Leveraging the tractability of information flow, we provide non-vacuous generalization bounds and strong theoretical guarantees for differential privacy, robustness to data poisoning attacks, and extraction attacks. In experiments with LLMs, we demonstrate empirically that black-box optimization methods-despite the scalability and computational challenges inherent to black-box approaches-are able to learn, showing how a few iterations of BBoxER improve performance, generalize well on a benchmark of reasoning datasets, and are robust to membership inference attacks. This positions BBoxER as an attractive add-on on top of gradient-based optimization, offering suitability for deployment in restricted or privacy-sensitive environments while also providing non-vacuous generalization guarantees.

FLU-DYNJun 30, 2024
NeurIPS 2024 ML4CFD Competition: Harnessing Machine Learning for Computational Fluid Dynamics in Airfoil Design

Mouadh Yagoubi, David Danan, Milad Leyli-abadi et al.

The integration of machine learning (ML) techniques for addressing intricate physics problems is increasingly recognized as a promising avenue for expediting simulations. However, assessing ML-derived physical models poses a significant challenge for their adoption within industrial contexts. This competition is designed to promote the development of innovative ML approaches for tackling physical challenges, leveraging our recently introduced unified evaluation framework known as Learning Industrial Physical Simulations (LIPS). Building upon the preliminary edition held from November 2023 to March 2024, this iteration centers on a task fundamental to a well-established physical application: airfoil design simulation, utilizing our proposed AirfRANS dataset. The competition evaluates solutions based on various criteria encompassing ML accuracy, computational efficiency, Out-Of-Distribution performance, and adherence to physical principles. Notably, this competition represents a pioneering effort in exploring ML-driven surrogate methods aimed at optimizing the trade-off between computational efficiency and accuracy in physical simulations. Hosted on the Codabench platform, the competition offers online training and evaluation for all participating solutions.

FLU-DYNFeb 22, 2022
CD-ROM: Complemented Deep-Reduced Order Model

Emmanuel Menier, Michele Alessandro Bucci, Mouadh Yagoubi et al.

Model order reduction through the POD-Galerkin method can lead to dramatic gains in terms of computational efficiency in solving physical problems. However, the applicability of the method to non linear high-dimensional dynamical systems such as the Navier-Stokes equations has been shown to be limited, producing inaccurate and sometimes unstable models. This paper proposes a deep learning based closure modeling approach for classical POD-Galerkin reduced order models (ROM). The proposed approach is theoretically grounded, using neural networks to approximate well studied operators. In contrast with most previous works, the present CD-ROM approach is based on an interpretable continuous memory formulation, derived from simple hypotheses on the behavior of partially observed dynamical systems. The final corrected models can hence be simulated using most classical time stepping schemes. The capabilities of the CD-ROM approach are demonstrated on two classical examples from Computational Fluid Dynamics, as well as a parametric case, the Kuramoto-Sivashinsky equation.

LGNov 5, 2021
Frugal Machine Learning

Mikhail Evchenko, Joaquin Vanschoren, Holger H. Hoos et al.

Machine learning, already at the core of increasingly many systems and applications, is set to become even more ubiquitous with the rapid rise of wearable devices and the Internet of Things. In most machine learning applications, the main focus is on the quality of the results achieved (e.g., prediction accuracy), and hence vast amounts of data are being collected, requiring significant computational resources to build models. In many scenarios, however, it is infeasible or impractical to set up large centralized data repositories. In personal health, for instance, privacy issues may inhibit the sharing of detailed personal data. In such cases, machine learning should ideally be performed on wearable devices themselves, which raises major computational limitations such as the battery capacity of smartwatches. This paper thus investigates frugal learning, aimed to build the most accurate possible models using the least amount of resources. A wide range of learning algorithms is examined through a frugal lens, analyzing their accuracy/runtime performance on a wide range of data sets. The most promising algorithms are thereafter assessed in a real-world scenario by implementing them in a smartwatch and letting them learn activity recognition models on the watch itself.

AIMay 17, 2021
DISCO Verification: Division of Input Space into COnvex polytopes for neural network verification

Julien Girard-Satabin, Aymeric Varasse, Marc Schoenauer et al.

The impressive results of modern neural networks partly come from their non linear behaviour. Unfortunately, this property makes it very difficult to apply formal verification tools, even if we restrict ourselves to networks with a piecewise linear structure. However, such networks yields subregions that are linear and thus simpler to analyse independently. In this paper, we propose a method to simplify the verification problem by operating a partitionning into multiple linear subproblems. To evaluate the feasibility of such an approach, we perform an empirical analysis of neural networks to estimate the number of linear regions, and compare them to the bounds currently known. We also present the impact of a technique aiming at reducing the number of linear regions during training.

NEMay 2, 2021
Paradiseo: From a Modular Framework for Evolutionary Computation to the Automated Design of Metaheuristics ---22 Years of Paradiseo---

Johann Dreo, Arnaud Liefooghe, Sébastien Verel et al.

The success of metaheuristic optimization methods has led to the development of a large variety of algorithm paradigms. However, no algorithm clearly dominates all its competitors on all problems. Instead, the underlying variety of landscapes of optimization problems calls for a variety of algorithms to solve them efficiently. It is thus of prior importance to have access to mature and flexible software frameworks which allow for an efficient exploration of the algorithm design space. Such frameworks should be flexible enough to accommodate any kind of metaheuristics, and open enough to connect with higher-level optimization, monitoring and evaluation softwares. This article summarizes the features of the ParadisEO framework, a comprehensive C++ free software which targets the development of modular metaheuristics. ParadisEO provides a highly modular architecture, a large set of components, speed of execution and automated algorithm design features, which are key to modern approaches to metaheuristics development.

MLFeb 26, 2021
Zoetrope Genetic Programming for Regression

Aurélie Boisbunon, Carlo Fanara, Ingrid Grenet et al.

The Zoetrope Genetic Programming (ZGP) algorithm is based on an original representation for mathematical expressions, targeting evolutionary symbolic regression.The zoetropic representation uses repeated fusion operations between partial expressions, starting from the terminal set. Repeated fusions within an individual gradually generate more complex expressions, ending up in what can be viewed as new features. These features are then linearly combined to best fit the training data. ZGP individuals then undergo specific crossover and mutation operators, and selection takes place between parents and offspring. ZGP is validated using a large number of public domain regression datasets, and compared to other symbolic regression algorithms, as well as to traditional machine learning algorithms. ZGP reaches state-of-the-art performance with respect to both types of algorithms, and demonstrates a low computational time compared to other symbolic regression approaches.

LGNov 25, 2019
CAMUS: A Framework to Build Formal Specifications for Deep Perception Systems Using Simulators

Julien Girard-Satabin, Guillaume Charpiat, Zakaria Chihani et al.

The topic of provable deep neural network robustness has raised considerable interest in recent years. Most research has focused on adversarial robustness, which studies the robustness of perceptive models in the neighbourhood of particular samples. However, other works have proved global properties of smaller neural networks. Yet, formally verifying perception remains uncharted. This is due notably to the lack of relevant properties to verify, as the distribution of possible inputs cannot be formally specified. We propose to take advantage of the simulators often used either to train machine learning models or to check them with statistical tests, a growing trend in industry. Our formulation allows us to formally express and verify safety properties on perception units, covering all cases that could ever be generated by the simulator, to the difference of statistical tests which cover only seen examples. Along with this theoretical formulation , we provide a tool to translate deep learning models into standard logical formulae. As a proof of concept, we train a toy example mimicking an autonomous car perceptive unit, and we formally verify that it will never fail to capture the relevant information in the provided inputs.

SPAug 22, 2019
LEAP nets for power grid perturbations

Benjamin Donnot, Balthazar Donon, Isabelle Guyon et al.

We propose a novel neural network embedding approach to model power transmission grids, in which high voltage lines are disconnected and reconnected with one-another from time to time, either accidentally or willfully. We call our architeture LEAP net, for Latent Encoding of Atypical Perturbation. Our method implements a form of transfer learning, permitting to train on a few source domains, then generalize to new target domains, without learning on any example of that domain. We evaluate the viability of this technique to rapidly assess cu-rative actions that human operators take in emergency situations, using real historical data, from the French high voltage power grid.

LGJun 1, 2019
Automated Machine Learning with Monte-Carlo Tree Search

Herilalaina Rakotoarison, Marc Schoenauer, Michèle Sebag

The AutoML task consists of selecting the proper algorithm in a machine learning portfolio, and its hyperparameter values, in order to deliver the best performance on the dataset at hand. Mosaic, a Monte-Carlo tree search (MCTS) based approach, is presented to handle the AutoML hybrid structural and parametric expensive black-box optimization problem. Extensive empirical studies are conducted to independently assess and compare: i) the optimization processes based on Bayesian optimization or MCTS; ii) its warm-start initialization; iii) the ensembling of the solutions gathered along the search. Mosaic is assessed on the OpenML 100 benchmark and the Scikit-learn portfolio, with statistically significant gains over Auto-Sklearn, winner of former international AutoML challenges.

MLMar 21, 2019
Multi-Domain Adversarial Learning

Alice Schoenauer-Sebag, Louise Heinrich, Marc Schoenauer et al.

Multi-domain learning (MDL) aims at obtaining a model with minimal average risk across multiple domains. Our empirical motivation is automated microscopy data, where cultured cells are imaged after being exposed to known and unknown chemical perturbations, and each dataset displays significant experimental bias. This paper presents a multi-domain adversarial learning approach, MuLANN, to leverage multiple datasets with overlapping but distinct class sets, in a semi-supervised setting. Our contributions include: i) a bound on the average- and worst-domain risk in MDL, obtained using the H-divergence; ii) a new loss to accommodate semi-supervised multi-domain learning and domain adaptation; iii) the experimental validation of the approach, improving on the state of the art on two standard image benchmarks, and a novel bioimage dataset, Cell.

NEFeb 27, 2019
On the Behaviour of Differential Evolution for Problems with Dynamic Linear Constraints

Maryam Hasani-Shoreh, María-Yaneli Ameca-Alducin, Wilson Blaikie et al.

Evolutionary algorithms have been widely applied for solving dynamic constrained optimization problems (DCOPs) as a common area of research in evolutionary optimization. Current benchmarks proposed for testing these problems in the continuous spaces are either not scalable in problem dimension or the settings for the environmental changes are not flexible. Moreover, they mainly focus on non-linear environmental changes on the objective function. While the dynamism in some real-world problems exists in the constraints and can be emulated with linear constraint changes. The purpose of this paper is to introduce a framework which produces benchmarks in which a dynamic environment is created with simple changes in linear constraints (rotation and translation of constraint's hyperplane). Our proposed framework creates dynamic benchmarks that are flexible in terms of number of changes, dimension of the problem and can be applied to test any objective function. Different constraint handling techniques will then be used to compare with our benchmark. The results reveal that with these changes set, there was an observable effect on the performance of the constraint handling techniques.

MLMay 3, 2018
Optimization of computational budget for power system risk assessment

Benjamin Donnot, Isabelle Guyon, Antoine Marot et al.

We address the problem of maintaining high voltage power transmission networks in security at all time, namely anticipating exceeding of thermal limit for eventual single line disconnection (whatever its cause may be) by running slow, but accurate, physical grid simulators. New conceptual frameworks are calling for a probabilistic risk-based security criterion. However, these approaches suffer from high requirements in terms of tractability. Here, we propose a new method to assess the risk. This method uses both machine learning techniques (artificial neural networks) and more standard simulators based on physical laws. More specifically we train neural networks to estimate the overall dangerousness of a grid state. A classical benchmark problem (manpower 118 buses test case) is used to show the strengths of the proposed method.

SOC-PHMay 3, 2018
Anticipating contingengies in power grids using fast neural net screening

Benjamin Donnot, Isabelle Guyon, Marc Schoenauer et al.

We address the problem of maintaining high voltage power transmission networks in security at all time. This requires that power flowing through all lines remain below a certain nominal thermal limit above which lines might melt, break or cause other damages. Current practices include enforcing the deterministic "N-1" reliability criterion, namely anticipating exceeding of thermal limit for any eventual single line disconnection (whatever its cause may be) by running a slow, but accurate, physical grid simulator. New conceptual frameworks are calling for a probabilistic risk based security criterion and are in need of new methods to assess the risk. To tackle this difficult assessment, we address in this paper the problem of rapidly ranking higher order contingencies including all pairs of line disconnections, to better prioritize simulations. We present a novel method based on neural networks, which ranks "N-1" and "N-2" contingencies in decreasing order of presumed severity. We demonstrate on a classical benchmark problem that the residual risk of contingencies decreases dramatically compared to considering solely all "N-1" cases, at no additional computational cost. We evaluate that our method scales up to power grids of the size of the French high voltage power grid (over 1000 power lines).

NEMar 9, 2018
The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities

Joel Lehman, Jeff Clune, Dusan Misevic et al.

Biological evolution provides a creative fount of complex and subtle adaptations, often surprising the scientists who discover them. However, because evolution is an algorithmic process that transcends the substrate in which it occurs, evolution's creativity is not limited to nature. Indeed, many researchers in the field of digital evolution have observed their evolving algorithms and organisms subverting their intentions, exposing unrecognized bugs in their code, producing unexpected adaptations, or exhibiting outcomes uncannily convergent with ones in nature. Such stories routinely reveal creativity by evolution in these digital worlds, but they rarely fit into the standard scientific narrative. Instead they are often treated as mere obstacles to be overcome, rather than results that warrant study in their own right. The stories themselves are traded among researchers through oral tradition, but that mode of information transmission is inefficient and prone to error and outright loss. Moreover, the fact that these stories tend to be shared only among practitioners means that many natural scientists do not realize how interesting and lifelike digital organisms are and how natural their evolution can be. To our knowledge, no collection of such anecdotes has been published before. This paper is the crowd-sourced product of researchers in the fields of artificial life and evolutionary computation who have provided first-hand accounts of such cases. It thus serves as a written, fact-checked collection of scientifically important and even entertaining stories. In doing so we also present here substantial evidence that the existence and importance of evolutionary surprises extends beyond the natural world, and may indeed be a universal property of all complex evolving systems.

MLJan 30, 2018
Fast Power system security analysis with Guided Dropout

Benjamin Donnot, Isabelle Guyon, Marc Schoenauer et al.

We propose a new method to efficiently compute load-flows (the steady-state of the power-grid for given productions, consumptions and grid topology), substituting conventional simulators based on differential equation solvers. We use a deep feed-forward neural network trained with load-flows precomputed by simulation. Our architecture permits to train a network on so-called "n-1" problems, in which load flows are evaluated for every possible line disconnection, then generalize to "n-2" problems without retraining (a clear advantage because of the combinatorial nature of the problem). To that end, we developed a technique bearing similarity with "dropout", which we named "guided dropout".

MLSep 27, 2017
Introducing machine learning for power system operation support

Benjamin Donnot, Isabelle Guyon, Marc Schoenauer et al.

We address the problem of assisting human dispatchers in operating power grids in today's changing context using machine learning, with theaim of increasing security and reducing costs. Power networks are highly regulated systems, which at all times must meet varying demands of electricity with a complex production system, including conventional power plants, less predictable renewable energies (such as wind or solar power), and the possibility of buying/selling electricity on the international market with more and more actors involved at a Europeanscale. This problem is becoming ever more challenging in an aging network infrastructure. One of the primary goals of dispatchers is to protect equipment (e.g. avoid that transmission lines overheat) with few degrees of freedom: we are considering in this paper solely modifications in network topology, i.e. re-configuring the way in which lines, transformers, productions and loads are connected in sub-stations. Using years of historical data collected by the French Transmission Service Operator (TSO) "Réseau de Transport d'Electricité" (RTE), we develop novel machine learning techniques (drawing on "deep learning") to mimic human decisions to devise "remedial actions" to prevent any line to violate power flow limits (so-called "thermal limits"). The proposed technique is hybrid. It does not rely purely on machine learning: every action will be tested with actual simulators before being proposed to the dispatchers or implemented on the grid.

MLSep 5, 2017
Stochastic Gradient Descent: Going As Fast As Possible But Not Faster

Alice Schoenauer-Sebag, Marc Schoenauer, Michèle Sebag

When applied to training deep neural networks, stochastic gradient descent (SGD) often incurs steady progression phases, interrupted by catastrophic episodes in which loss and gradient norm explode. A possible mitigation of such events is to slow down the learning process. This paper presents a novel approach to control the SGD learning rate, that uses two statistical tests. The first one, aimed at fast learning, compares the momentum of the normalized gradient vectors to that of random unit vectors and accordingly gracefully increases or decreases the learning rate. The second one is a change point detection test, aimed at the detection of catastrophic learning episodes; upon its triggering the learning rate is instantly halved. Both abilities of speeding up and slowing down the learning rate allows the proposed approach, called SALeRA, to learn as fast as possible but not faster. Experiments on standard benchmarks show that SALeRA performs well in practice, and compares favorably to the state of the art.

LGJun 10, 2017
Toward Optimal Run Racing: Application to Deep Learning Calibration

Olivier Bousquet, Sylvain Gelly, Karol Kurach et al.

This paper aims at one-shot learning of deep neural nets, where a highly parallel setting is considered to address the algorithm calibration problem - selecting the best neural architecture and learning hyper-parameter values depending on the dataset at hand. The notoriously expensive calibration problem is optimally reduced by detecting and early stopping non-optimal runs. The theoretical contribution regards the optimality guarantees within the multiple hypothesis testing framework. Experimentations on the Cifar10, PTB and Wiki benchmarks demonstrate the relevance of the approach with a principled and consistent improvement on the state of the art with no extra hyper-parameter.

AIOct 27, 2016
Anomaly Detection with the Voronoi Diagram Evolutionary Algorithm

Marti Luis, Fansi-Tchango Arsene, Navarro Laurent et al.

This paper presents the Voronoi diagram-based evolutionary algorithm (VorEAl). VorEAl partitions input space in abnormal/normal subsets using Voronoi diagrams. Diagrams are evolved using a multi-objective bio-inspired approach in order to conjointly optimize classification metrics while also being able to represent areas of the data space that are not present in the training dataset. As part of the paper VorEAl is experimentally validated and contrasted with similar approaches.

AIJun 19, 2014
Racing Multi-Objective Selection Probabilities

Gaétan Marceau, Marc Schoenauer

In the context of Noisy Multi-Objective Optimization, dealing with uncertainties requires the decision maker to define some preferences about how to handle them, through some statistics (e.g., mean, median) to be used to evaluate the qualities of the solutions, and define the corresponding Pareto set. Approximating these statistics requires repeated samplings of the population, drastically increasing the overall computational cost. To tackle this issue, this paper proposes to directly estimate the probability of each individual to be selected, using some Hoeffding races to dynamically assign the estimation budget during the selection step. The proposed racing approach is validated against static budget approaches with NSGA-II on noisy versions of the ZDT benchmark functions.

NEJun 10, 2014
Maximum Likelihood-based Online Adaptation of Hyper-parameters in CMA-ES

Ilya Loshchilov, Marc Schoenauer, Michèle Sebag et al.

The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is widely accepted as a robust derivative-free continuous optimization algorithm for non-linear and non-convex optimization problems. CMA-ES is well known to be almost parameterless, meaning that only one hyper-parameter, the population size, is proposed to be tuned by the user. In this paper, we propose a principled approach called self-CMA-ES to achieve the online adaptation of CMA-ES hyper-parameters in order to improve its overall performance. Experimental results show that for larger-than-default population size, the default settings of hyper-parameters of CMA-ES are far from being optimal, and that self-CMA-ES allows for dynamically approaching optimal settings.

AISep 16, 2013
Computational Methods for Probabilistic Inference of Sector Congestion in Air Traffic Management

Gaétan Marceau, Pierre Savéant, Marc Schoenauer

This article addresses the issue of computing the expected cost functions from a probabilistic model of the air traffic flow and capacity management. The Clenshaw-Curtis quadrature is compared to Monte-Carlo algorithms defined specifically for this problem. By tailoring the algorithms to this model, we reduce the computational burden in order to simulate real instances. The study shows that the Monte-Carlo algorithm is more sensible to the amount of uncertainty in the system, but has the advantage to return a result with the associated accuracy on demand. The performances for both approaches are comparable for the computation of the expected cost of delay and the expected cost of congestion. Finally, this study shows some evidences that the simulation of the proposed probabilistic model is tractable for realistic instances.

AISep 16, 2013
Multiobjective Tactical Planning under Uncertainty for Air Traffic Flow and Capacity Management

Gaétan Marceau, Pierre Savéant, Marc Schoenauer

We investigate a method to deal with congestion of sectors and delays in the tactical phase of air traffic flow and capacity management. It relies on temporal objectives given for every point of the flight plans and shared among the controllers in order to create a collaborative environment. This would enhance the transition from the network view of the flow management to the local view of air traffic control. Uncertainty is modeled at the trajectory level with temporal information on the boundary points of the crossed sectors and then, we infer the probabilistic occupancy count. Therefore, we can model the accuracy of the trajectory prediction in the optimization process in order to fix some safety margins. On the one hand, more accurate is our prediction; more efficient will be the proposed solutions, because of the tighter safety margins. On the other hand, when uncertainty is not negligible, the proposed solutions will be more robust to disruptions. Furthermore, a multiobjective algorithm is used to find the tradeoff between the delays and congestion, which are antagonist in airspace with high traffic density. The flow management position can choose manually, or automatically with a preference-based algorithm, the adequate solution. This method is tested against two instances, one with 10 flights and 5 sectors and one with 300 flights and 16 sectors.

AISep 16, 2013
Strategic Planning in Air Traffic Control as a Multi-objective Stochastic Optimization Problem

Gaétan Marceau, Pierre Savéant, Marc Schoenauer

With the objective of handling the airspace sector congestion subject to continuously growing air traffic, we suggest to create a collaborative working plan during the strategic phase of air traffic control. The plan obtained via a new decision support tool presented in this article consists in a schedule for controllers, which specifies time of overflight on the different waypoints of the flight plans. In order to do it, we believe that the decision-support tool shall model directly the uncertainty at a trajectory level in order to propagate the uncertainty to the sector level. Then, the probability of congestion for any sector in the airspace can be computed. Since air traffic regulations and sector congestion are antagonist, we designed and implemented a multi-objective optimization algorithm for determining the best trade-off between these two criteria. The solution comes up as a set of alternatives for the multi-sector planner where the severity of the congestion cost is adjustable. In this paper, the Non-dominated Sorting Genetic Algorithm (NSGA-II) was used to solve an artificial benchmark problem involving 24 aircraft and 11 sectors, and is able to provide a good approximation of the Pareto front.

LGAug 12, 2013
KL-based Control of the Learning Schedule for Surrogate Black-Box Optimization

Ilya Loshchilov, Marc Schoenauer, Michèle Sebag

This paper investigates the control of an ML component within the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) devoted to black-box optimization. The known CMA-ES weakness is its sample complexity, the number of evaluations of the objective function needed to approximate the global optimum. This weakness is commonly addressed through surrogate optimization, learning an estimate of the objective function a.k.a. surrogate model, and replacing most evaluations of the true objective function with the (inexpensive) evaluation of the surrogate model. This paper presents a principled control of the learning schedule (when to relearn the surrogate model), based on the Kullback-Leibler divergence of the current search distribution and the training distribution of the former surrogate model. The experimental validation of the proposed approach shows significant performance gains on a comprehensive set of ill-conditioned benchmark problems, compared to the best state of the art including the quasi-Newton high-precision BFGS method.

AIMay 10, 2013
Quality Measures of Parameter Tuning for Aggregated Multi-Objective Temporal Planning

Mostepha Redouane Khouadjia, Marc Schoenauer, Vincent Vidal et al.

Parameter tuning is recognized today as a crucial ingredient when tackling an optimization problem. Several meta-optimization methods have been proposed to find the best parameter set for a given optimization algorithm and (set of) problem instances. When the objective of the optimization is some scalar quality of the solution given by the target algorithm, this quality is also used as the basis for the quality of parameter sets. But in the case of multi-objective optimization by aggregation, the set of solutions is given by several single-objective runs with different weights on the objectives, and it turns out that the hypervolume of the final population of each single-objective run might be a better indicator of the global performance of the aggregation method than the best fitness in its population. This paper discusses this issue on a case study in multi-objective temporal planning using the evolutionary planner DaE-YAHSP and the meta-optimizer ParamILS. The results clearly show how ParamILS makes a difference between both approaches, and demonstrate that indeed, in this context, using the hypervolume indicator as ParamILS target is the best choice. Other issues pertaining to parameter tuning in the proposed context are also discussed.

AIMay 6, 2013
Multi-Objective AI Planning: Comparing Aggregation and Pareto Approaches

Mostepha Redouane Khouadjia, Marc Schoenauer, Vincent Vidal et al.

Most real-world Planning problems are multi-objective, trying to minimize both the makespan of the solution plan, and some cost of the actions involved in the plan. But most, if not all existing approaches are based on single-objective planners, and use an aggregation of the objectives to remain in the single-objective context. Divide and Evolve (DaE) is an evolutionary planner that won the temporal deterministic satisficing track at the last International Planning Competitions (IPC). Like all Evolutionary Algorithms (EA), it can easily be turned into a Pareto-based Multi-Objective EA. It is however important to validate the resulting algorithm by comparing it with the aggregation approach: this is the goal of this paper. The comparative experiments on a recently proposed benchmark set that are reported here demonstrate the usefulness of going Pareto-based in AI Planning.

NEApr 10, 2013
Sustainable Cooperative Coevolution with a Multi-Armed Bandit

François-Michel De Rainville, Michèle Sebag, Christian Gagné et al.

This paper proposes a self-adaptation mechanism to manage the resources allocated to the different species comprising a cooperative coevolutionary algorithm. The proposed approach relies on a dynamic extension to the well-known multi-armed bandit framework. At each iteration, the dynamic multi-armed bandit makes a decision on which species to evolve for a generation, using the history of progress made by the different species to guide the decisions. We show experimentally, on a benchmark and a real-world problem, that evolving the different populations at different paces allows not only to identify solutions more rapidly, but also improves the capacity of cooperative coevolution to solve more complex problems.