Learning Operators by Regularized Stochastic Gradient Descent with Operator-valued Kernels
This work addresses ill-posed inverse problems in machine learning and provides theoretical guarantees for operator learning. Its contribution is incremental, however, since it builds on existing SGD and kernel methods.
The paper tackles the estimation of regression operators in statistical inverse problems using regularized stochastic gradient descent with operator-valued kernels. It obtains dimension-independent convergence rates that are near-optimal in expectation, along with high-probability guarantees for prediction and estimation errors.
We consider a class of statistical inverse problems involving the estimation of a regression operator from a Polish space to a separable Hilbert space, where the target lies in a vector-valued reproducing kernel Hilbert space induced by an operator-valued kernel. To address the associated ill-posedness, we analyze regularized stochastic gradient descent (SGD) algorithms in both online and finite-horizon settings. The former uses polynomially decaying step sizes and regularization parameters, while the latter adopts fixed values. Under suitable structural and distributional assumptions, we establish dimension-independent bounds for prediction and estimation errors. The resulting convergence rates are near-optimal in expectation, and we also derive high-probability estimates that imply almost sure convergence. Our analysis introduces a general technique for obtaining high-probability guarantees in infinite-dimensional settings. Possible extensions to broader kernel classes and encoder-decoder structures are briefly discussed.
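To make the two step-size regimes in the abstract concrete, here is a minimal sketch of a regularized kernel SGD iteration. It is an illustration under stated assumptions, not the paper's algorithm or tuning. It assumes a separable kernel K(x, x') = k(x, x') I with a scalar Gaussian k, takes R^2 as a finite-dimensional stand-in for the output Hilbert space, and uses illustrative schedule constants and exponents (0.5, 0.01, -0.5, -0.25) that do not come from the source. The names `gaussian_kernel`, `predict`, and `regularized_kernel_sgd` are hypothetical.

```python
import math
import random

def gaussian_kernel(x, z, bandwidth=0.2):
    """Scalar Gaussian kernel k(x, z); the bandwidth is an assumed value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def predict(x, centers, coeffs):
    """Evaluate f(x) = sum_i k(x, x_i) c_i, with each c_i a vector in R^d."""
    d_out = len(coeffs[0]) if coeffs else 0
    out = [0.0] * d_out
    for z, c in zip(centers, coeffs):
        w = gaussian_kernel(x, z)
        for j in range(d_out):
            out[j] += w * c[j]
    return out

def regularized_kernel_sgd(X, Y, step_size, reg_param):
    """Iterate f_{t+1} = f_t - eta_t * ((f_t(x_t) - y_t) K(x_t, .) + lam_t * f_t),
    starting from f_1 = 0, storing f_t through its expansion coefficients."""
    coeffs = []
    for t, (x, y) in enumerate(zip(X, Y), start=1):
        eta, lam = step_size(t), reg_param(t)
        pred = predict(x, X[:t - 1], coeffs) if coeffs else [0.0] * len(y)
        # The regularization term shrinks every existing coefficient.
        shrink = 1.0 - eta * lam
        coeffs = [[shrink * cj for cj in c] for c in coeffs]
        # The data-fit term adds one new coefficient at the current sample.
        coeffs.append([-eta * (p - yj) for p, yj in zip(pred, y)])
    return coeffs

# Synthetic data (assumed for illustration): noisy samples of a smooth map.
random.seed(0)
T = 300
X = [[random.random()] for _ in range(T)]
Y = [[math.sin(2 * math.pi * x[0]) + random.gauss(0.0, 0.1),
      math.cos(2 * math.pi * x[0]) + random.gauss(0.0, 0.1)] for x in X]

# Online setting: polynomially decaying step sizes and regularization.
online = regularized_kernel_sgd(
    X, Y, lambda t: 0.5 * t ** -0.5, lambda t: 0.01 * t ** -0.25)

# Finite-horizon setting: fixed values chosen from the known horizon T.
fixed = regularized_kernel_sgd(
    X, Y, lambda t: T ** -0.5, lambda t: 0.01 * T ** -0.5)
```

Both runs share the same update and differ only in the schedules passed in, which mirrors the distinction drawn in the abstract. The stored expansion grows by one term per sample, so the sketch costs O(T^2) kernel evaluations overall.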