GNJan 21, 2017
Asymptotic efficiency of the proportional compensation scheme for a large number of producersDmitry B. Rokhlin, Anatoly Usov
We consider a manager, who allocates some fixed total payment amount between $N$ rational agents in order to maximize the aggregate production. The profit of $i$-th agent is the difference between the compensation (reward) obtained from the manager and the production cost. We compare (i) the \emph{normative} compensation scheme, where the manager enforces the agents to follow an optimal cooperative strategy; (ii) the \emph{linear piece rates} compensation scheme, where the manager announces an optimal reward per unit good; (iii) the \emph{proportional} compensation scheme, where agent's reward is proportional to his contribution to the total output. Denoting the correspondent total production levels by $s^*$, $\hat s$ and $\overline s$ respectively, where the last one is related to the unique Nash equilibrium, we examine the limits of the prices of anarchy $\mathscr A_N=s^*/\overline s$, $\mathscr A_N'=\hat s/\overline s$ as $N\to\infty$. These limits are calculated for the cases of identical convex costs with power asymptotics at the origin, and for power costs, corresponding to the Coob-Douglas and generalized CES production functions with decreasing returns to scale. Our results show that asymptotically no performance is lost in terms of $\mathscr A'_N$, and in terms of $\mathscr A_N$ the loss does not exceed $31\%$.
LGApr 27
Dynamic Regret for Online Regression in RKHS via Discounted VAW and Subspace ApproximationDmitry B. Rokhlin, Georgiy A. Karapetyants
We study online regression with the square loss in a reproducing kernel Hilbert space under a dynamic regret criterion. The learner is compared with a time-varying comparator sequence, and the bounds depend on its path length in the RKHS norm. The proposed method transfers the finite-dimensional discounted Vovk--Azoury--Warmuth approach of Jacobsen \& Cutkosky (2024) to the RKHS setting by means of finite-dimensional subspace approximations. For a fixed subspace, we run a VAW-based ensemble of discounted VAW forecasters over a geometric grid of discount factors. The additional approximation error is controlled by the uniform projection error of kernel sections. We then introduce a general orthogonal truncation method: starting from a feature expansion of the kernel, we construct the associated RKHS by introducing an inner product that makes the feature functions orthonormal, and then use the spans of the first basis functions as finite-dimensional approximation spaces. The resulting subspace reduction is applied to several approximation schemes. Explicit feature expansions yield fast-regime bounds for Gaussian and analytic dot-product kernels. Mercer truncations provide a spectral approximation method and lead to dynamic regret bounds in fast and slow regimes, depending on the eigenvalue decay. Finally, we study subspaces spanned by kernel sections and apply this construction to Matérn kernels.
LGMar 25, 2025
Random feature-based double Vovk-Azoury-Warmuth algorithm for online multi-kernel learningDmitry B. Rokhlin, Olga V. Gurtovaya
We introduce a novel multi-kernel learning algorithm, VAW$^2$, for online least squares regression in reproducing kernel Hilbert spaces (RKHS). VAW$^2$ leverages random Fourier feature-based functional approximation and the Vovk-Azoury-Warmuth (VAW) method in a two-level procedure: VAW is used to construct expert strategies from random features generated for each kernel at the first level, and then again to combine their predictions at the second level. A theoretical analysis yields a regret bound of $O(T^{1/2}\ln T)$ in expectation with respect to artificial randomness, when the number of random features scales as $T^{1/2}$. Empirical results on some benchmark datasets demonstrate that VAW$^2$ achieves superior performance compared to the existing online multi-kernel learning algorithms: Raker and OMKL-GF, and to other theoretically grounded method methods involving convex combination of expert predictions at the second level.
LGJun 27, 2025
A hierarchical Vovk-Azoury-Warmuth forecaster with discounting for online regression in RKHSDmitry B. Rokhlin
We study the problem of online regression with the unconstrained quadratic loss against a time-varying sequence of functions from a Reproducing Kernel Hilbert Space (RKHS). Recently, Jacobsen and Cutkosky (2024) introduced a discounted Vovk-Azoury-Warmuth (DVAW) forecaster that achieves optimal dynamic regret in the finite-dimensional case. In this work, we lift their approach to the non-parametric domain by synthesizing the DVAW framework with a random feature approximation. We propose a fully adaptive, hierarchical algorithm, which we call H-VAW-D (Hierarchical Vovk-Azoury-Warmuth with Discounting), that learns both the discount factor and the number of random features. We prove that this algorithm, which has a per-iteration computational complexity of $O(T\ln T)$, achieves an expected dynamic regret of $O(T^{2/3}P_T^{1/3} + \sqrt{T}\ln T)$, where $P_T$ is the functional path length of a comparator sequence.
MLAug 1, 2018
Robbins-Monro conditions for persistent exploration learning strategiesDmitry B. Rokhlin
We formulate simple assumptions, implying the Robbins-Monro conditions for the $Q$-learning algorithm with the local learning rate, depending on the number of visits of a particular state-action pair (local clock) and the number of iteration (global clock). It is assumed that the Markov decision process is communicating and the learning policy ensures the persistent exploration. The restrictions are imposed on the functional dependence of the learning rate on the local and global clocks. The result partially confirms the conjecture of Bradkte (1994).
LGMay 2, 2017
PDE approach to the problem of online prediction with expert advice: a construction of potential-based strategiesDmitry B. Rokhlin
We consider a sequence of repeated prediction games and formally pass to the limit. The supersolutions of the resulting non-linear parabolic partial differential equation are closely related to the potential functions in the sense of N.\,Cesa-Bianci, G.\,Lugosi (2003). Any such supersolution gives an upper bound for forecaster's regret and suggests a potential-based prediction strategy, satisfying the Blackwell condition. A conventional upper bound for the worst-case regret is justified by a simple verification argument.
LGMay 11, 2016
Asymptotic sequential Rademacher complexity of a finite function classDmitry B. Rokhlin
For a finite function class we describe the large sample limit of the sequential Rademacher complexity in terms of the viscosity solution of a $G$-heat equation. In the language of Peng's sublinear expectation theory, the same quantity equals to the expected value of the largest order statistics of a multidimensional $G$-normal random variable. We illustrate this result by deriving upper and lower bounds for the asymptotic sequential Rademacher complexity.