OCDSLGSTMLAug 24, 2020

Stochastic Multi-level Composition Optimization Algorithms with Level-Independent Convergence Rates

arXiv:2008.10526v540 citations
Originality Incremental advance
AI Analysis

This work addresses optimization challenges in machine learning and AI for scenarios involving nested compositions, offering an online algorithm that matches single-level complexity, which is incremental but practically useful.

The paper tackles stochastic multi-level composition optimization problems by proposing two algorithms with moving-average stochastic estimates, achieving sample complexities of O(1/ε^6) and improved O(1/ε^4) for convergence to an ε-stationary point.

In this paper, we study smooth stochastic multi-level composition optimization problems, where the objective function is a nested composition of $T$ functions. We assume access to noisy evaluations of the functions and their gradients, through a stochastic first-order oracle. For solving this class of problems, we propose two algorithms using moving-average stochastic estimates, and analyze their convergence to an $ε$-stationary point of the problem. We show that the first algorithm, which is a generalization of \cite{GhaRuswan20} to the $T$ level case, can achieve a sample complexity of $\mathcal{O}(1/ε^6)$ by using mini-batches of samples in each iteration. By modifying this algorithm using linearized stochastic estimates of the function values, we improve the sample complexity to $\mathcal{O}(1/ε^4)$. {\color{black}This modification not only removes the requirement of having a mini-batch of samples in each iteration, but also makes the algorithm parameter-free and easy to implement}. To the best of our knowledge, this is the first time that such an online algorithm designed for the (un)constrained multi-level setting, obtains the same sample complexity of the smooth single-level setting, under standard assumptions (unbiasedness and boundedness of the second moments) on the stochastic first-order oracle.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes