PR Mar 5, 2016
Branching diffusion representation of semilinear PDEs and Monte Carlo approximation
Pierre Henry-Labordere, Nadia Oudjane, Xiaolu Tan et al.
We provide a representation result of parabolic semi-linear PDEs, with polynomial nonlinearity, by branching diffusion processes. We extend the classical representation for KPP equations, introduced by Skorokhod (1964), Watanabe (1965) and McKean (1975), by allowing for polynomial nonlinearity in the pair $(u, Du)$, where $u$ is the solution of the PDE with space gradient $Du$. Similar to the previous literature, our result requires a non-explosion condition which restricts it to "small maturity" or "small nonlinearity" of the PDE. Our main ingredient is the automatic differentiation technique as in Henry-Labordere, Tan and Touzi (2015), based on the Malliavin integration by parts, which allows us to account for the nonlinearities in the gradient. As a consequence, the particles of our branching diffusion are marked by the nature of the nonlinearity. This new representation has very important numerical implications as it is suitable for Monte Carlo simulation. Indeed, this provides the first numerical method for high dimensional nonlinear PDEs with error estimate induced by the dimension-free central limit theorem. The complexity is also easily seen to be of the order of the squared dimension. The final section of this paper illustrates the efficiency of the algorithm by some high dimensional numerical experiments.
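For orientation, the classical KPP representation that this abstract extends can be sketched as a small Monte Carlo routine. This is only an illustration under stated assumptions, not the paper's algorithm: it assumes the one-dimensional equation $\partial_t u = \tfrac{1}{2}\partial_{xx} u + \beta(u^2 - u)$ with $u(0,\cdot) = g$ and $|g| \le 1$, binary branching at rate $\beta$, and the McKean formula $u(t,x) = \mathbb{E}\big[\prod_i g(X^i_t)\big]$ over the particles alive at time $t$. The name `kpp_branching_estimate` and its parameters are hypothetical, and the sketch handles neither gradient nonlinearities nor Malliavin weights.

```python
import math
import random


def kpp_branching_estimate(g, x, t, beta, n_paths, rng):
    """Monte Carlo estimate of u(t, x) for the classical KPP equation.

    Assumption (not from the paper): each particle follows a standard
    Brownian motion and splits into two at exponential rate beta.
    """
    total = 0.0
    for _ in range(n_paths):
        # Each stack entry is (position, remaining time) of a live particle.
        stack = [(x, t)]
        product = 1.0
        while stack:
            pos, remaining = stack.pop()
            tau = rng.expovariate(beta)  # time until the next branching
            if tau >= remaining:
                # Particle survives to the horizon: evaluate g at its endpoint.
                end = pos + math.sqrt(remaining) * rng.gauss(0.0, 1.0)
                product *= g(end)
            else:
                # Particle branches: two offspring start from the same point.
                branch = pos + math.sqrt(tau) * rng.gauss(0.0, 1.0)
                stack.append((branch, remaining - tau))
                stack.append((branch, remaining - tau))
        total += product
    return total / n_paths
```

With a constant initial condition $g \equiv c$ the PDE reduces to the logistic ODE $u' = \beta(u^2 - u)$, whose solution $u(t) = 1 / (1 + (1/c - 1)e^{\beta t})$ gives a convenient sanity check for the estimator.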
NA Aug 18, 2010
Snell envelope with path dependent multiplicative optimality criteria
Pierre Del Moral, Peng Hu, Nadia Oudjane
We analyze the Snell envelope with path dependent multiplicative optimality criteria. Especially for this case, we propose a variation of the Snell envelope backward recursion which allows us to extend some classical approximation schemes to the multiplicatively path dependent case. In this framework, we propose an importance sampling particle approximation scheme based on a specific change of measure, designed to concentrate the computational effort in regions pointed out by the criteria. This new algorithm is theoretically studied. We provide non-asymptotic convergence estimates and prove that the resulting estimator is high-biased.
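As background, the standard Snell envelope backward recursion that the abstract takes as its starting point can be sketched on a finite-state Markov chain: $U_N = f$ and $U_n(x) = \max\big(f(x), \sum_y P(x,y)\,U_{n+1}(y)\big)$. The sketch below is a minimal illustration assuming a time-homogeneous payoff `payoff` and transition matrix `transition` (both hypothetical names); it does not implement the multiplicative path dependent variation or the particle scheme of the paper.

```python
def snell_envelope(payoff, transition, horizon):
    """Return values[n][x], the Snell envelope at time n and state x.

    Assumptions (illustrative only): finite state space, the same payoff
    vector and transition matrix at every time step.
    """
    n_states = len(payoff)
    values = [None] * (horizon + 1)
    values[horizon] = list(payoff)  # at the horizon, stopping is forced
    for n in range(horizon - 1, -1, -1):
        nxt = values[n + 1]
        values[n] = [
            # Stop now (payoff) or continue (conditional expectation).
            max(
                payoff[x],
                sum(transition[x][y] * nxt[y] for y in range(n_states)),
            )
            for x in range(n_states)
        ]
    return values
```

For example, `snell_envelope([1.0, 3.0], [[0.5, 0.5], [0.5, 0.5]], 2)` yields one value vector per time step, with the time-0 vector giving the optimal stopping value from each starting state.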
OC Feb 16, 2023
Reimagining Demand-Side Management with Mean Field Learning
Bianca Marin Moreno, Margaux Brégère, Pierre Gaillard et al.
Integrating renewable energy into the power grid while balancing supply and demand is a complex issue, given the intermittent nature of renewable sources. Demand side management (DSM) offers solutions to this challenge. We propose a new method for DSM, in particular for the problem of controlling a large population of electrical devices to follow a desired consumption signal. We model it as a finite horizon Markovian mean field control problem. We develop a new algorithm, MD-MFC, which provides theoretical guarantees for convex and Lipschitz objective functions. What distinguishes MD-MFC from the existing load control literature is its effectiveness in directly solving the target tracking problem without resorting to regularization techniques on the main problem. A non-standard Bregman divergence on a mirror descent scheme allows dynamic programming to be used to obtain simple closed-form solutions. In addition, we show that general mean-field game algorithms can be applied to this problem, which expands the possibilities for addressing load control problems. We illustrate our claims with experiments on a realistic data set.
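To illustrate the mirror descent idea in its simplest form, the sketch below tracks a target consumption with a single static distribution over device states. It is not MD-MFC: it assumes the standard entropic (Kullback-Leibler) Bregman divergence on the probability simplex rather than the non-standard divergence of the paper, ignores the Markovian dynamics, and uses the hypothetical objective $F(p) = \tfrac{1}{2}(\langle c, p\rangle - \text{target})^2$, where `consumption` is an assumed per-state consumption vector $c$.

```python
import math


def entropic_mirror_descent(consumption, target, n_steps=500, step=1.0):
    """Minimise 0.5 * (<consumption, p> - target)**2 over the simplex.

    Assumption (illustrative only): KL mirror map, so each update is a
    multiplicative-weights step followed by renormalisation.
    """
    n = len(consumption)
    p = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(n_steps):
        gap = sum(c * q for c, q in zip(consumption, p)) - target
        grad = [gap * c for c in consumption]  # gradient of the objective
        weights = [q * math.exp(-step * g) for q, g in zip(p, grad)]
        z = sum(weights)
        p = [w / z for w in weights]  # normalise back onto the simplex
    return p
```

With two states consuming 0 and 1 unit and a target of 0.3, the iterates approach the distribution that puts mass 0.3 on the consuming state. The multiplicative form of the update is what the entropic divergence buys: the iterate stays on the simplex without an explicit projection.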
LG May 12, 2025
Online Episodic Convex Reinforcement Learning
Bianca Marin Moreno, Khaled Eldowa, Pierre Gaillard et al.
We study online learning in episodic finite-horizon Markov decision processes (MDPs) with convex objective functions, known as the concave utility reinforcement learning (CURL) problem. This setting generalizes RL from linear to convex losses on the state-action distribution induced by the agent's policy. The non-linearity of CURL invalidates classical Bellman equations and requires new algorithmic approaches. We introduce the first algorithm achieving near-optimal regret bounds for online CURL without any prior knowledge on the transition function. To achieve this, we use an online mirror descent algorithm with varying constraint sets and a carefully designed exploration bonus. We then address for the first time a bandit version of CURL, where the only feedback is the value of the objective function on the state-action distribution induced by the agent's policy. We achieve a sub-linear regret bound for this more challenging problem by adapting techniques from bandit convex optimization to the MDP setting.
OC Aug 7, 2019
A Privacy-preserving Method to Optimize Distributed Resource Allocation
Olivier Beaude, Pascal Benchimol, Stéphane Gaubert et al.
We consider a resource allocation problem involving a large number of agents with individual constraints subject to privacy, and a central operator whose objective is to optimize a global, possibly nonconvex, cost while satisfying the agents' constraints, for instance an energy operator in charge of the management of energy consumption flexibilities of many individual consumers. We provide a privacy-preserving algorithm that computes the optimal allocation of resources without requiring any agent to reveal her private information (constraints and individual solution profile) either to the central operator or to a third party. Our method relies on an aggregation procedure: we compute iteratively a global allocation of resources, and gradually ensure existence of a disaggregation, that is individual profiles satisfying agents' private constraints, by a protocol involving the generation of polyhedral cuts and secure multiparty computations (SMC). To obtain these cuts, we use an alternate projection method, which is implemented locally by each agent, preserving her privacy needs. We address in particular the case in which the local and global constraints define a transportation polytope. Then, we provide theoretical convergence estimates together with numerical results, showing that the algorithm can be effectively used to solve the allocation problem in high dimension, while addressing privacy issues.
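The alternate projection idea mentioned above can be sketched in its generic form: project back and forth between two convex sets until the iterates agree. The example below is a minimal illustration under assumptions that are not taken from the paper: one set is a box of per-period bounds `lo` and `hi`, the other is the hyperplane of profiles summing to `total`, and the function names are hypothetical. It omits the cut generation and the secure multiparty computation of the actual protocol.

```python
def project_box(x, lo, hi):
    """Euclidean projection onto the box lo <= x <= hi (componentwise clip)."""
    return [min(max(v, a), b) for v, a, b in zip(x, lo, hi)]


def project_sum(x, total):
    """Euclidean projection onto the hyperplane sum(x) == total."""
    shift = (total - sum(x)) / len(x)
    return [v + shift for v in x]


def alternating_projections(x0, lo, hi, total, n_iter=200, tol=1e-10):
    """Alternate between the two projections until the iterates coincide.

    Returns the last point in the box; if the sets intersect, it is also
    (approximately) on the hyperplane.
    """
    x = list(x0)
    y = project_box(x, lo, hi)
    for _ in range(n_iter):
        y = project_box(x, lo, hi)
        x = project_sum(y, total)
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            break
    return y
```

For instance, starting from `[0, 0, 0]` with bounds `[0, 0, 0]` and `[1, 2, 3]` and a total of `4.0`, the iterates settle on a profile inside the box whose entries sum to the total. When the two sets do not intersect, the gap between the iterates stays bounded away from zero; roughly speaking, a persistent gap of that kind is what a separating cut would be built from.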