Andrew Kurdila

2 Papers

SY · Sep 14, 2023
Rates of Convergence in Certain Native Spaces of Approximations used in Reinforcement Learning

Ali Bouland, Shengyuan Niu, Sai Tej Paruchuri et al.

This paper studies convergence rates for some value function approximations that arise in a collection of reproducing kernel Hilbert spaces (RKHS) $H(Ω)$. By casting an optimal control problem in a specific class of native spaces, strong rates of convergence are derived for the operator equation that enables offline approximations that appear in policy iteration. Explicit upper bounds on the error in value function and controller approximations are derived in terms of the power function $\mathcal{P}_{H,N}$ for the space of finite-dimensional approximants $H_N$ in the native space $H(Ω)$. These bounds are geometric in nature and refine some well-known, now classical results concerning convergence of approximations of value functions.
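As a rough illustration of the power function mentioned in the abstract, the sketch below evaluates it under assumptions that are not taken from the paper: a Gaussian kernel on $[0,1]$, equally spaced centers, and $H_N$ taken to be the span of the kernel sections at those centers. In that setting the power function has the standard form $\mathcal{P}_{H,N}(x)^2 = K(x,x) - k_N(x)^T K_N^{-1} k_N(x)$. The names `gaussian_kernel`, `power_function`, `length_scale`, and `jitter` are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, y, length_scale=0.2):
    """Gaussian kernel matrix between 1-D point sets x (n,) and y (m,)."""
    diff = x[:, None] - y[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def power_function(x_eval, centers, length_scale=0.2, jitter=1e-10):
    """Power function of span{K(., c_i)} evaluated at the points x_eval.

    Assumption: H_N is spanned by kernel sections at `centers`, so that
    P(x)^2 = K(x, x) - k_N(x)^T K_N^{-1} k_N(x).
    """
    K_NN = gaussian_kernel(centers, centers, length_scale)
    K_Nx = gaussian_kernel(centers, x_eval, length_scale)
    # A small jitter keeps the linear solve numerically stable.
    coeffs = np.linalg.solve(K_NN + jitter * np.eye(len(centers)), K_Nx)
    # K(x, x) = 1 for the Gaussian kernel used here.
    p_squared = 1.0 - np.sum(K_Nx * coeffs, axis=0)
    return np.sqrt(np.clip(p_squared, 0.0, None))

x_eval = np.linspace(0.0, 1.0, 201)
max_power = {}
for n in (3, 5, 9):
    centers = np.linspace(0.0, 1.0, n)  # nested sets of centers
    max_power[n] = power_function(x_eval, centers).max()
    print(n, max_power[n])
```

Because the three sets of centers are nested, the maximum of the power function over the grid shrinks as centers are added, which is the geometric quantity that error bounds of this type are expressed in.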

ML · Mar 30, 2020
Learning Theory for Estimation of Animal Motion Submanifolds

Nathan Powell, Andrew Kurdila

This paper describes the formulation and experimental testing of a novel method for the estimation and approximation of submanifold models of animal motion. It is assumed that the animal motion is supported on a configuration manifold $Q$ that is a smooth, connected, regularly embedded Riemannian submanifold of Euclidean space $X\approx \mathbb{R}^d$ for some $d>0$, and that the manifold $Q$ is homeomorphic to a known smooth, Riemannian manifold $S$. Estimation of the manifold is achieved by finding an unknown mapping $γ:S\rightarrow Q\subset X$ that maps the manifold $S$ into $Q$. The overall problem is cast as a distribution-free learning problem over the manifold of measurements $\mathbb{Z}=S\times X$. That is, it is assumed that experiments generate finite sets $\{(s_i,x_i)\}_{i=1}^m\subset \mathbb{Z}^m$ of samples that are generated according to an unknown probability density $μ$ on $\mathbb{Z}$. This paper derives approximations $γ_{n,m}$ of $γ$ that are based on the $m$ samples and are contained in an $N(n)$-dimensional space of approximants. The paper defines sufficient conditions that show that the rates of convergence in $L^2_μ(S)$ correspond to those known for classical distribution-free learning theory over Euclidean space. Specifically, the paper derives sufficient conditions that guarantee rates of convergence that have the form $$\mathbb{E} \left (\|γ_μ^j-γ_{n,m}^j\|_{L^2_μ(S)}^2\right )\leq C_1 N(n)^{-r} + C_2 \frac{N(n)\log(N(n))}{m}$$ for constants $C_1,C_2$, with $γ_μ:=\{γ^1_μ,\ldots,γ^d_μ\}$ the regressor function $γ_μ:S\rightarrow Q\subset X$ and $γ_{n,m}:=\{γ^1_{n,m},\ldots,γ^d_{n,m}\}$.
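The bound in the abstract has two competing terms: an approximation term $C_1 N(n)^{-r}$ that shrinks as the dimension grows and a sampling term $C_2 N(n)\log(N(n))/m$ that grows with it. The sketch below evaluates the right-hand side numerically to show that trade-off. The values $C_1 = C_2 = 1$ and $r = 2$ are placeholders chosen only for illustration, not values from the paper, and the helper names `rate_bound` and `best_dimension` are hypothetical.

```python
import math

def rate_bound(N, m, r=2.0, C1=1.0, C2=1.0):
    """Right-hand side C1 * N^(-r) + C2 * N * log(N) / m of the bound.

    C1, C2 and r are illustrative placeholders, not values from the paper.
    """
    return C1 * N ** (-r) + C2 * N * math.log(N) / m

def best_dimension(m, r=2.0, C1=1.0, C2=1.0, N_max=10_000):
    """Dimension N in [2, N_max] that minimizes the bound for m samples."""
    return min(range(2, N_max + 1),
               key=lambda N: rate_bound(N, m, r, C1, C2))

for m in (10**2, 10**4, 10**6):
    N_star = best_dimension(m)
    print(m, N_star, rate_bound(N_star, m))
```

Under these placeholder constants, the minimizing dimension increases with the number of samples $m$ while the minimized bound decreases, which is the usual balance between approximation error and sample error in distribution-free learning estimates.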