Pascal Bianchi

12 papers

180 citations

Novelty 52%

AI Score 29

Ranked #152,024 of 201,326 authors (top 76%); #473 in OC (top 72%)

12 Papers

OC Dec 2, 2013

Convergence of a Multi-Agent Projected Stochastic Gradient Algorithm for Non-Convex Optimization

Pascal Bianchi, Jérémie Jakubowicz

We introduce a new framework for the convergence analysis of a class of distributed constrained non-convex optimization algorithms in multi-agent systems. The aim is to search for local minimizers of a non-convex objective function which is assumed to be a sum of local utility functions of the agents. The algorithm under study consists of two steps: a local stochastic gradient descent at each agent and a gossip step that drives the network of agents to a consensus. Under the assumption of decreasing stepsize, it is proved that consensus is asymptotically achieved in the network and that the algorithm converges to the set of Karush-Kuhn-Tucker points. As an important feature, the algorithm does not require the double-stochasticity of the gossip matrices. It is in particular suitable for use in a natural broadcast scenario for which no feedback messages between agents are required. It is proved that our result also holds if the number of communications in the network per unit of time vanishes at moderate speed as time increases, allowing for potential savings of the network's energy. Applications to power allocation in wireless ad-hoc networks are discussed. Finally, we provide numerical results which support our claims.

OC Dec 2, 2013

Performance of a Distributed Stochastic Approximation Algorithm

Pascal Bianchi, Gersende Fort, Walid Hachem

In this paper, a distributed stochastic approximation algorithm is studied. Applications of such algorithms include decentralized estimation, optimization, control or computing. The algorithm consists of two steps: a local step, where each node in a network updates a local estimate using a stochastic approximation algorithm with decreasing step size, and a gossip step, where a node computes a local weighted average between its estimates and those of its neighbors. Convergence of the estimates toward a consensus is established under weak assumptions. The approach relies on two main ingredients: the existence of a Lyapunov function for the mean field in the agreement subspace, and a contraction property of the random matrices of weights in the subspace orthogonal to the agreement subspace. A second order analysis of the algorithm is also performed in the form of a Central Limit Theorem. The Polyak-averaged version of the algorithm is also considered.
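
The two-step structure described in this abstract (a local stochastic approximation step followed by a gossip averaging step) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic local costs 0.5 * (x - targets[i])**2, the fixed ring-shaped weight matrix (the paper considers random weight matrices), the step size 1/n, and the function name `distributed_sa`.

```python
import random


def distributed_sa(targets, num_iters=5000, noise_std=0.5, seed=0):
    """Toy local-step + gossip-step iteration on a ring of agents.

    Agent i holds the hypothetical local cost 0.5 * (x - targets[i])**2,
    so the sum of the local costs is minimized at the mean of `targets`.
    """
    rng = random.Random(seed)
    n_agents = len(targets)
    x = [0.0] * n_agents
    for n in range(1, num_iters + 1):
        step = 1.0 / n  # decreasing step size (assumed schedule)
        # Local step: each agent follows a noisy gradient of its own cost.
        local = [
            x[i] - step * ((x[i] - targets[i]) + rng.gauss(0.0, noise_std))
            for i in range(n_agents)
        ]
        # Gossip step: weighted average with the two ring neighbors.
        x = [
            0.5 * local[i]
            + 0.25 * local[(i - 1) % n_agents]
            + 0.25 * local[(i + 1) % n_agents]
            for i in range(n_agents)
        ]
    return x


if __name__ == "__main__":
    print(distributed_sa([1.0, 2.0, 3.0, 4.0]))
```

In this toy setting the agents' estimates should end up close to one another and close to the mean of the targets, which is the consensus behavior the abstract refers to.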

LG Jun 17, 2024

Long-time asymptotics of noisy SVGD outside the population limit

Victor Priser, Pascal Bianchi, Adil Salim

Stein Variational Gradient Descent (SVGD) is a widely used sampling algorithm that has been successfully applied in several areas of Machine Learning. SVGD operates by iteratively moving a set of interacting particles (which represent the samples) to approximate the target distribution. Despite recent studies on the complexity of SVGD and its variants, their long-time asymptotic behavior (i.e., after many iterations) is still not understood in the regime of a finite number of particles. We study the long-time asymptotic behavior of a noisy variant of SVGD. First, we establish that the limit set of noisy SVGD for a large number of iterations is well-defined. We then characterize this limit set, showing that it approaches the target distribution as the number of particles increases. In particular, noisy SVGD provably avoids the variance collapse observed for SVGD. Our approach involves demonstrating that the trajectories of noisy SVGD closely resemble those described by a McKean-Vlasov process.

OC Aug 4, 2021

Stochastic Subgradient Descent Escapes Active Strict Saddles on Weakly Convex Functions

Pascal Bianchi, Walid Hachem, Sholom Schechtman

In non-smooth stochastic optimization, we establish the non-convergence of the stochastic subgradient descent (SGD) to the critical points recently called active strict saddles by Davis and Drusvyatskiy. Such points lie on a manifold $M$ where the function $f$ has a direction of second-order negative curvature. Off this manifold, the norm of the Clarke subdifferential of $f$ is lower-bounded. We require two conditions on $f$. The first assumption is a Verdier stratification condition, which is a refinement of the popular Whitney stratification. It allows us to establish a reinforced version of the projection formula of Bolte \emph{et al.} for Whitney stratifiable functions, which is of independent interest. The second assumption, termed the angle condition, allows us to control the distance of the iterates to $M$. When $f$ is weakly convex, our assumptions are generic. Consequently, generically in the class of definable weakly convex functions, the SGD converges to a local minimizer.

LG Jun 14, 2021

Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

Anas Barakat, Pascal Bianchi, Julien Lehmann

Actor-critic methods integrating target networks have exhibited remarkable empirical success in deep reinforcement learning. However, a theoretical understanding of the use of target networks in actor-critic methods is largely missing in the literature. In this paper, we reduce this gap between theory and practice by proposing the first theoretical analysis of an online target-based actor-critic algorithm with linear function approximation in the discounted reward setting. Our algorithm uses three different timescales: one for the actor and two for the critic. Instead of using the standard single-timescale temporal difference (TD) learning algorithm as a critic, we use a two-timescale target-based version of TD learning closely inspired by practical actor-critic algorithms implementing target networks. First, we establish asymptotic convergence results for both the critic and the actor under Markovian sampling. Then, we provide a finite-time analysis showing the impact of incorporating a target network into actor-critic methods.

ME Jun 23, 2020

Conditional independence testing via weighted partial copulas and nearest neighbors

Pascal Bianchi, Kevin Elgui, François Portier

This paper introduces the \textit{weighted partial copula} function for testing conditional independence. The proposed test procedure results from these two ingredients: (i) the test statistic is an explicit Cramér-von Mises transformation of the \textit{weighted partial copula}, (ii) the regions of rejection are computed using a bootstrap procedure which mimics conditional independence by generating samples from the product measure of the estimated conditional marginals. Under conditional independence, the weak convergence of the \textit{weighted partial copula process} is established when the marginals are estimated using a smoothed local linear estimator. Finally, an experimental section demonstrates that the proposed test has competitive power compared to recent state-of-the-art methods such as kernel-based tests.

OC Nov 18, 2019

Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization

Anas Barakat, Pascal Bianchi

Although ADAM is a very popular algorithm for optimizing the weights of neural networks, it has been recently shown that it can diverge even in simple convex optimization examples. Several variants of ADAM have been proposed to circumvent this convergence issue. In this work, we study the ADAM algorithm for smooth nonconvex optimization under a boundedness assumption on the adaptive learning rate. The bound on the adaptive step size depends on the Lipschitz constant of the gradient of the objective function and provides safe theoretical adaptive step sizes. Under this boundedness assumption, we show a novel first order convergence rate result in both deterministic and stochastic contexts. Furthermore, we establish convergence rates of the function value sequence using the Kurdyka-Lojasiewicz property.

OC Jan 23, 2019

A Fully Stochastic Primal-Dual Algorithm

Pascal Bianchi, Walid Hachem, Adil Salim

A new stochastic primal--dual algorithm for solving a composite optimization problem is proposed. It is assumed that all the functions/operators that enter the optimization problem are given as statistical expectations. These expectations are unknown but revealed across time through i.i.d. realizations. The proposed algorithm is proven to converge to a saddle point of the Lagrangian function. In the framework of the monotone operator theory, the convergence proof relies on recent results on the stochastic Forward Backward algorithm involving random monotone operators. An example of convex optimization under stochastic linear constraints is considered.

ML Oct 4, 2018

Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization

Anas Barakat, Pascal Bianchi

Adam is a popular variant of stochastic gradient descent for finding a local minimizer of a function. In the constant stepsize regime, assuming that the objective function is differentiable and non-convex, we establish the convergence in the long run of the iterates to a stationary point under a stability condition. The key ingredient is the introduction of a continuous-time version of Adam, in the form of a non-autonomous ordinary differential equation. This continuous-time system is a relevant approximation of the Adam iterates, in the sense that the interpolated Adam process converges weakly towards the solution to the ODE. The existence and the uniqueness of the solution are established. We further show the convergence of the solution towards the critical points of the objective function and quantify its convergence rate under a Lojasiewicz assumption. Then, we introduce a novel decreasing stepsize version of Adam. Under mild assumptions, it is shown that the iterates are almost surely bounded and converge almost surely to critical points of the objective function. Finally, we analyze the fluctuations of the algorithm by means of a conditional central limit theorem.
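
For readers unfamiliar with the iteration being analyzed, the following is a minimal sketch of the commonly used Adam update with a constant step size, applied to a scalar objective. It is not the paper's continuous-time system or its decreasing-stepsize variant. The function name `adam_minimize`, the default hyperparameters, and the quadratic test objective are assumptions chosen for illustration.

```python
import math


def adam_minimize(grad, x0, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, num_iters=5000):
    """Run the standard Adam update on a scalar variable.

    `grad` returns the (possibly noisy) gradient at x; here it is
    deterministic to keep the example reproducible.
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, num_iters + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)         # bias corrections
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


if __name__ == "__main__":
    # Hypothetical objective f(x) = (x - 3)**2 with gradient 2 * (x - 3).
    print(adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0))
```

On this toy objective the iterate should settle near the stationary point x = 3.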

OC Apr 3, 2018

A Constant Step Stochastic Douglas-Rachford Algorithm with Application to Non Separable Regularizations

Adil Salim, Pascal Bianchi, Walid Hachem

The Douglas-Rachford algorithm converges to a minimizer of a sum of two convex functions. The algorithm consists of fixed point iterations involving computations of the proximity operators of the two functions separately. The paper investigates a stochastic version of the algorithm where both functions are random and the step size is constant. We establish that the iterates of the algorithm stay close to the set of solutions with high probability when the step size is small enough. An application to structured regularization is considered.
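
The fixed point iteration mentioned above can be sketched in its classical deterministic form, which is simpler than the stochastic version studied in the paper. The two functions f(x) = 0.5 * (x - 3)**2 and g(x) = |x|, the step size, and all function names below are assumptions chosen because their proximity operators have closed forms.

```python
def prox_quadratic(v, gamma, center=3.0):
    """Proximity operator of gamma * 0.5 * (x - center)**2."""
    return (v + gamma * center) / (1.0 + gamma)


def prox_abs(v, gamma):
    """Proximity operator of gamma * |x| (soft-thresholding)."""
    if v > gamma:
        return v - gamma
    if v < -gamma:
        return v + gamma
    return 0.0


def douglas_rachford(num_iters=200, gamma=1.0, y0=0.0):
    """Deterministic Douglas-Rachford iteration for min f(x) + g(x)."""
    y = y0
    x = prox_quadratic(y, gamma)
    for _ in range(num_iters):
        x = prox_quadratic(y, gamma)      # proximity step on f
        z = prox_abs(2.0 * x - y, gamma)  # proximity step on g at the reflection
        y = y + z - x                     # fixed point update
    return x


if __name__ == "__main__":
    print(douglas_rachford())
```

For this pair of functions the minimizer of f + g is x = 2 (the soft-thresholding of 3 at level 1), so the returned value can be checked against it.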

OC Dec 19, 2017

Snake: a Stochastic Proximal Gradient Algorithm for Regularized Problems over Large Graphs

Adil Salim, Pascal Bianchi, Walid Hachem

A regularized optimization problem over a large unstructured graph is studied, where the regularization term is tied to the graph geometry. Typical regularization examples include the total variation and the Laplacian regularizations over the graph. When applying the proximal gradient algorithm to solve this problem, there exist quite affordable methods to implement the proximity operator (backward step) in the special case where the graph is a simple path without loops. In this paper, an algorithm, referred to as "Snake", is proposed to solve such regularized problems over general graphs, by taking advantage of these fast methods. The algorithm consists of properly selecting random simple paths in the graph and performing the proximal gradient algorithm over these simple paths. This algorithm is an instance of a new general stochastic proximal gradient algorithm, whose convergence is proven. Applications to trend filtering and graph inpainting are provided among others. Numerical experiments are conducted over large graphs.

OC Jul 25, 2016

Ergodic convergence of a stochastic proximal point algorithm

Pascal Bianchi

The purpose of this paper is to establish the almost sure weak ergodic convergence of a sequence of iterates $(x_n)$ given by $x_{n+1} = (I+λ_n A(ξ_{n+1},\,.\,))^{-1}(x_n)$ where $(A(s,\,.\,):s\in E)$ is a collection of maximal monotone operators on a separable Hilbert space, $(ξ_n)$ is an independent identically distributed sequence of random variables on $E$ and $(λ_n)$ is a positive sequence in $\ell^2\backslash \ell^1$. The weighted averaged sequence of iterates is shown to converge weakly to a zero (assumed to exist) of the Aumann expectation ${\mathbb E}(A(ξ_1,\,.\,))$ under the assumption that the latter is maximal. We consider applications to stochastic optimization problems of the form $\min {\mathbb E}(f(ξ_1,x))$ w.r.t. $x\in \bigcap_{i=1}^m X_i$ where $f$ is a normal convex integrand and $(X_i)$ is a collection of closed convex sets. In this case, the iterations are closely related to a stochastic proximal algorithm recently proposed by Wang and Bertsekas.
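
The iteration $x_{n+1} = (I+λ_n A(ξ_{n+1},\,.\,))^{-1}(x_n)$ can be made concrete in a scalar toy case. The sketch below assumes $A(ξ, x) = x - ξ$ (the gradient of $0.5(x-ξ)^2$), for which the resolvent has a closed form, takes $λ_n = 1/n$ (which lies in $\ell^2\backslash \ell^1$), and draws $ξ$ uniformly on $[1, 3]$. These choices, the weighting of the average, and the function name are illustrative assumptions, not the paper's general Hilbert space setting.

```python
import random


def stochastic_proximal_point(num_iters=20000, seed=0):
    """Scalar stochastic proximal point iteration with weighted averaging.

    With A(xi, x) = x - xi, solving x + lam * (x - xi) = v gives the
    resolvent (I + lam * A(xi, .))^{-1}(v) = (v + lam * xi) / (1 + lam).
    """
    rng = random.Random(seed)
    x = 0.0
    weighted_sum = 0.0
    weight_total = 0.0
    for n in range(1, num_iters + 1):
        lam = 1.0 / n                      # square-summable, not summable
        xi = rng.uniform(1.0, 3.0)         # i.i.d. sample, mean 2
        x = (x + lam * xi) / (1.0 + lam)   # resolvent (proximal) step
        weighted_sum += lam * x
        weight_total += lam
    return x, weighted_sum / weight_total


if __name__ == "__main__":
    last_iterate, weighted_average = stochastic_proximal_point()
    print(last_iterate, weighted_average)
```

Here the zero of the expected operator is the mean of $ξ$, namely 2. In this toy case the weighted average approaches it slowly because the early iterates carry the largest weights, while the last iterate is already very close.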