Vladislav B. Tadić

ST
h-index16
3papers
157citations
Novelty40%
AI Score22

3 Papers

25.7MLFeb 4, 2020
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

Maxim Kaledin, Eric Moulines, Alexey Naumov et al.

Linear two-timescale stochastic approximation (SA) scheme is an important class of algorithms which has become popular in reinforcement learning (RL), particularly for the policy evaluation problem. Recently, a number of works have been devoted to establishing the finite time analysis of the scheme, especially under the Markovian (non-i.i.d.) noise settings that are ubiquitous in practice. In this paper, we provide a finite-time analysis for linear two timescale SA. Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise, only the constants are affected by the mixing time of the Markov chain. With an appropriate step size schedule, the transient term in the expected error bound is $o(1/k^c)$ and the steady-state term is ${\cal O}(1/k)$, where $c>1$ and $k$ is the iteration number. Furthermore, we present an asymptotic expansion of the expected error with a matching lower bound of $Ω(1/k)$. A simple numerical experiment is presented to support our theory.

2.3STJun 25, 2018
Asymptotic Properties of Recursive Maximum Likelihood Estimation in Non-Linear State-Space Models

Vladislav Z. B. Tadic, Arnaud Doucet

Using stochastic gradient search and the optimal filter derivative, it is possible to perform recursive (i.e., online) maximum likelihood estimation in a non-linear state-space model. As the optimal filter and its derivative are analytically intractable for such a model, they need to be approximated numerically. In [Poyiadjis, Doucet and Singh, Biometrika 2018], a recursive maximum likelihood algorithm based on a particle approximation to the optimal filter derivative has been proposed and studied through numerical simulations. Here, this algorithm and its asymptotic behavior are analyzed theoretically. We show that the algorithm accurately estimates maxima to the underlying (average) log-likelihood when the number of particles is sufficiently large. We also derive (relatively) tight bounds on the estimation error. The obtained results hold under (relatively) mild conditions and cover several classes of non-linear state-space models met in practice.

19.8STAug 30, 2017
Asymptotic Bias of Stochastic Gradient Search

Vladislav B. Tadic, Arnaud Doucet

The asymptotic behavior of the stochastic gradient algorithm with a biased gradient estimator is analyzed. Relying on arguments based on the dynamic system theory (chain-recurrence) and the differential geometry (Yomdin theorem and Lojasiewicz inequality), tight bounds on the asymptotic bias of the iterates generated by such an algorithm are derived. The obtained results hold under mild conditions and cover a broad class of high-dimensional nonlinear algorithms. Using these results, the asymptotic properties of the policy-gradient (reinforcement) learning and adaptive population Monte Carlo sampling are studied. Relying on the same results, the asymptotic behavior of the recursive maximum split-likelihood estimation in hidden Markov models is analyzed, too.