LG · SP · SY · ML · Aug 23, 2020

Multi-kernel Passive Stochastic Gradient Algorithms and Transfer Learning

arXiv:2008.10020v2 · 5 citations
AI Analysis

This work addresses optimization challenges in machine learning for scenarios where gradient evaluations are uncontrolled, offering incremental improvements over classical methods.

The paper extends passive stochastic gradient algorithms, in which the learner cannot choose where noisy gradients are evaluated, with a multi-kernel version that performs better in high-dimensional settings and incorporates variance reduction. Numerical examples on transfer learning compare the multi-kernel passive LMS algorithm against the classical passive version.

This paper develops a novel passive stochastic gradient algorithm. In passive stochastic approximation, the stochastic gradient algorithm does not have control over the location where noisy gradients of the cost function are evaluated. Classical passive stochastic gradient algorithms use a kernel that approximates a Dirac delta to weigh the gradients based on how far they are evaluated from the desired point. In this paper we construct a multi-kernel passive stochastic gradient algorithm. The algorithm performs substantially better in high dimensional problems and incorporates variance reduction. We analyze the weak convergence of the multi-kernel algorithm and its rate of convergence. In numerical examples, we study the multi-kernel version of the passive least mean squares (LMS) algorithm for transfer learning to compare the performance with the classical passive version.
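To make the kernel-weighting idea in the abstract concrete, here is a minimal sketch of the classical single-kernel passive scheme, not the paper's multi-kernel algorithm, whose construction is not given in this summary. Everything specific is an assumption chosen for illustration: a scalar quadratic cost with minimizer `theta_star`, evaluation points drawn uniformly from [-3, 3], a Gaussian kernel, and the hypothetical function names `gaussian_kernel` and `passive_sgd`.

```python
import math
import random


def gaussian_kernel(u):
    """Standard Gaussian density; a smooth stand-in for a Dirac delta."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def passive_sgd(theta0, theta_star, step=0.02, bandwidth=0.2,
                n_iters=20000, noise_std=0.1, seed=0):
    """Classical passive stochastic gradient on C(x) = 0.5 * (x - theta_star)**2.

    Assumed toy setting: the algorithm never chooses where gradients are
    evaluated. It only observes (x, noisy gradient at x) pairs and weighs
    each gradient by how close x is to the current estimate.
    """
    rng = random.Random(seed)
    theta = theta0
    for _ in range(n_iters):
        # Evaluation point picked by the environment, not by the algorithm.
        x = rng.uniform(-3.0, 3.0)
        # Noisy gradient of the assumed quadratic cost, evaluated at x.
        noisy_grad = (x - theta_star) + rng.gauss(0.0, noise_std)
        # Kernel weight: large when x is near theta, close to zero otherwise.
        weight = gaussian_kernel((x - theta) / bandwidth) / bandwidth
        theta -= step * weight * noisy_grad
    return theta


estimate = passive_sgd(theta0=-1.0, theta_star=1.0)
print(estimate)
```

In this sketch a smaller `bandwidth` makes the kernel closer to a Dirac delta but discards more of the observed gradients. That trade-off becomes more severe as the dimension grows, which is the regime where the abstract reports the multi-kernel version doing substantially better.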