ML LG FA OC STJul 3, 2017

Generalization Properties of Doubly Stochastic Learning Algorithms

arXiv:1707.00577v28 citations

AI Analysis

This work addresses a theoretical gap for researchers in scalable kernel methods, though it is incremental as it builds on prior analyses with refinements.

The paper tackles the lack of theoretical understanding of generalization properties in doubly stochastic learning algorithms, deriving convergence results for both unregularized and regularized variants in nonparametric regression, with the unregularized results being the first of their kind and the regularized ones improving existing literature.

Doubly stochastic learning algorithms are scalable kernel methods that perform very well in practice. However, their generalization properties are not well understood and their analysis is challenging since the corresponding learning sequence may not be in the hypothesis space induced by the kernel. In this paper, we provide an in-depth theoretical analysis for different variants of doubly stochastic learning algorithms within the setting of nonparametric regression in a reproducing kernel Hilbert space and considering the square loss. Particularly, we derive convergence results on the generalization error for the studied algorithms either with or without an explicit penalty term. To the best of our knowledge, the derived results for the unregularized variants are the first of this kind, while the results for the regularized variants improve those in the literature. The novelties in our proof are a sample error bound that requires controlling the trace norm of a cumulative operator, and a refined analysis of bounding initial error.

View on arXiv PDF

Similar