Martin Zubeldia

LG
h-index10
4papers
32citations
Novelty59%
AI Score42

4 Papers

LGMar 28, 2023
Concentration of Contractive Stochastic Approximation: Additive and Multiplicative Noise

Zaiwei Chen, Siva Theja Maguluri, Martin Zubeldia

In this paper, we establish maximal concentration bounds for the iterates generated by a stochastic approximation (SA) algorithm under a contractive operator with respect to some arbitrary norm (for example, the $\ell_\infty$-norm). We consider two settings where the iterates are potentially unbounded: SA with bounded multiplicative noise and SA with sub-Gaussian additive noise. Our maximal concentration inequalities state that the convergence error has a sub-Gaussian tail in the additive noise setting and a Weibull tail (which is faster than polynomial decay but could be slower than exponential decay) in the multiplicative noise setting. In addition, we provide an impossibility result showing that it is generally impossible to have sub-exponential tails under multiplicative noise. To establish the maximal concentration bounds, we develop a novel bootstrapping argument that involves bounding the moment-generating function of a modified version of the generalized Moreau envelope of the convergence error and constructing an exponential supermartingale to enable using Ville's maximal inequality. We demonstrate the applicability of our theoretical results in the context of linear SA and reinforcement learning.

91.6PRMay 20
Concentration of General Stochastic Approximation Under Heavy-Tailed Markovian Noise

Shubhada Agrawal, Siva Theja Maguluri, Martin Zubeldia

We establish maximal concentration bounds for the iterates generated by stochastic approximation algorithms with general step sizes, where the noise has a finite-state Markovian component plus a Martingale-difference component. When the Martingale-difference noise is bounded, we show that the tail of the error can be sub-Gaussian, sub-Weibull, or something lighter than any Pareto but heavier than any Weibull, depending on the step size sequence and on whether the random operator is almost surely contractive, almost surely non-expansive, or expansive with positive probability. Our analysis relies on a novel Lyapunov function involving the moment-generating function of the solution to a Poisson equation, together with an auxiliary projected algorithm. We complement the upper bounds with worst-case examples showing that qualitatively sharper bounds are impossible. We further study the case of unbounded Martingale-difference noise when the average operator is contractive, and the step sizes are of order $1/k$. In this setting, we show that if the random operator is almost surely non-expansive, then the error tail is at most three times heavier than the noise tail, whereas if the random operator is expansive with positive probability, then the error may have substantially heavier tails. These results are obtained through a novel black-box truncation argument that reduces the unbounded-noise setting to the bounded-noise case.

MLMay 27, 2025
A General-Purpose Theorem for High-Probability Bounds of Stochastic Approximation with Polyak Averaging

Sajad Khodadadian, Martin Zubeldia

Polyak-Ruppert averaging is a widely used technique to achieve the optimal asymptotic variance of stochastic approximation (SA) algorithms, yet its high-probability performance guarantees remain underexplored in general settings. In this paper, we present a general framework for establishing non-asymptotic concentration bounds for the error of averaged SA iterates. Our approach assumes access to individual concentration bounds for the unaveraged iterates and yields a sharp bound on the averaged iterates. We also construct an example, showing the tightness of our result up to constant multiplicative factors. As direct applications, we derive tight concentration bounds for contractive SA algorithms and for algorithms such as temporal difference learning and Q-learning with averaging, obtaining new bounds in settings where traditional analysis is challenging.

LGFeb 22, 2018
Collaboratively Learning the Best Option, Using Bounded Memory

Lili Su, Martin Zubeldia, Nancy Lynch

We consider multi-armed bandit problems in social groups wherein each individual has bounded memory and shares the common goal of learning the best arm/option. We say an individual learns the best option if eventually (as $t \to \infty$) it pulls only the arm with the highest average reward. While this goal is provably impossible for an isolated individual, we show that, in social groups, this goal can be achieved easily with the aid of social persuasion, i.e., communication. Specifically, we study the learning dynamics wherein an individual sequentially decides on which arm to pull next based on not only its private reward feedback but also the suggestions provided by randomly chosen peers. Our learning dynamics are hard to analyze via explicit probabilistic calculations due to the stochastic dependency induced by social interaction. Instead, we employ the mean-field approximation method from statistical physics and we show: (1) With probability $\to 1$ as the social group size $N \to \infty $, every individual in the social group learns the best option. (2) Over an arbitrary finite time horizon $[0, T]$, with high probability (in $N$), the fraction of individuals that prefer the best option grows to 1 exponentially fast as $t$ increases ($t\in [0, T]$). A major innovation of our mean-filed analysis is a simple yet powerful technique to deal with absorbing states in the interchange of limits $N \to \infty$ and $t \to \infty $. The mean-field approximation method allows us to approximate the probabilistic sample paths of our learning dynamics by a deterministic and smooth trajectory that corresponds to the unique solution of a well-behaved system of ordinary differential equations (ODEs). Such an approximation is desired because the analysis of a system of ODEs is relatively easier than that of the original stochastic system.