Distributed Averaging Methods for Randomized Second Order Optimization
This addresses distributed optimization for heterogeneous computing systems with varying worker resources, offering an incremental improvement by focusing on bias reduction in existing methods.
The paper tackles distributed optimization problems where Hessian computation is expensive and communication is a bottleneck by developing unbiased parameter averaging methods for randomized second order optimization using Hessian sampling and sketching. The result includes closed-form formulas for regularization parameters and step sizes that provably minimize bias, with large-scale experiments on a serverless computing platform demonstrating the approach.
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a significant bottleneck. We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian. Existing works do not take the bias of the estimators into consideration, which limits their application to massively parallel computation. We provide closed-form formulas for regularization parameters and step sizes that provably minimize the bias for sketched Newton directions. We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems with varying worker resources. Additionally, we demonstrate the implications of our theoretical findings via large scale experiments performed on a serverless computing platform.