Asynchronous Stochastic Composition Optimization with Variance Reduction
This work addresses the scalability of composition optimization in machine learning applications such as risk management and reinforcement learning, offering an incremental improvement over existing sequential, single-machine methods.
The paper tackles the challenge of scaling composition optimization to large datasets by proposing two asynchronous parallel algorithms with variance reduction, which achieve linear convergence rates and provable linear speedup under bounded delays.
Composition optimization has drawn considerable attention in a wide variety of machine learning domains, from risk management to reinforcement learning. Existing methods for solving the composition optimization problem typically run sequentially on a single machine, which limits their applicability to large-scale problems. To address this issue, this paper proposes two asynchronous parallel variance-reduced stochastic compositional gradient (AsyVRSC) algorithms that are suited to large-scale datasets. The two algorithms are AsyVRSC-Shared for the shared-memory architecture and AsyVRSC-Distributed for the master-worker architecture. The embedded variance reduction techniques enable the algorithms to achieve linear convergence rates. Furthermore, AsyVRSC-Shared and AsyVRSC-Distributed enjoy provable linear speedup when the time delays are bounded by the data dimensionality or the sparsity ratio of the partial gradients, respectively. Extensive experiments are conducted to verify the effectiveness of the proposed algorithms.
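For readers unfamiliar with the setting, the following is a sketch of how a finite-sum composition problem and an SVRG-style variance-reduced compositional gradient are commonly written. The abstract does not state a formulation, so all symbols here ($F_i$, $G_j$, $n$, $m$, the snapshot $\tilde{x}$, and the sampled indices $i$, $j$) are assumptions introduced for illustration. The estimator shown is a generic sequential, single-sample form and not necessarily the exact update used by AsyVRSC-Shared or AsyVRSC-Distributed.

```latex
% Assumed finite-sum composition problem (notation is illustrative)
\min_{x \in \mathbb{R}^N} \; f(x) := \frac{1}{n} \sum_{i=1}^{n} F_i\bigl(G(x)\bigr),
\qquad
G(x) := \frac{1}{m} \sum_{j=1}^{m} G_j(x).

% Full gradient by the chain rule (\partial G is the Jacobian of G)
\nabla f(x) = \bigl(\partial G(x)\bigr)^{\top} \, \frac{1}{n} \sum_{i=1}^{n} \nabla F_i\bigl(G(x)\bigr).

% SVRG-style estimates around a snapshot \tilde{x}, with sampled indices i and j
\hat{G} = G(\tilde{x}) + G_j(x) - G_j(\tilde{x}),
\qquad
\partial \hat{G} = \partial G(\tilde{x}) + \partial G_j(x) - \partial G_j(\tilde{x}),

v = \bigl(\partial \hat{G}\bigr)^{\top} \nabla F_i\bigl(\hat{G}\bigr)
  - \bigl(\partial G(\tilde{x})\bigr)^{\top} \nabla F_i\bigl(G(\tilde{x})\bigr)
  + \nabla f(\tilde{x}).
```

Under these assumptions, setting $x = \tilde{x}$ gives $\hat{G} = G(\tilde{x})$ and $\partial \hat{G} = \partial G(\tilde{x})$, so $v = \nabla f(\tilde{x})$ exactly. The correction terms vanish at the snapshot and shrink as $x$ approaches it, which is the intuition behind variance reduction enabling linear convergence for strongly convex objectives.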