Optimizing the Optimal Weighted Average: Efficient Distributed Sparse Classification
This work addresses communication bottlenecks in distributed machine learning for large datasets, offering an incremental improvement over existing non-interactive methods.
The paper addresses communication costs in distributed training of linear models by introducing ACOWA, a technique that spends one extra round of communication to improve approximation quality at a minor runtime cost. For sparse distributed logistic regression, ACOWA produces solutions that are closer to the empirical risk minimizer and substantially more accurate than those of other distributed algorithms.
While distributed training is often viewed as a solution to optimizing linear models on increasingly large datasets, inter-machine communication costs of popular distributed approaches can dominate as data dimensionality increases. Recent work on non-interactive algorithms shows that approximate solutions for linear models can be obtained efficiently with only a single round of communication among machines. However, this approximation often degenerates as the number of machines increases. In this paper, building on the recent optimal weighted average method, we introduce a new technique, ACOWA, that allows an extra round of communication to achieve noticeably better approximation quality with minor runtime increases. Results show that for sparse distributed logistic regression, ACOWA obtains solutions that are more faithful to the empirical risk minimizer and attain substantially higher accuracy than other distributed algorithms.
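To make the weighted-average idea concrete, here is a minimal sketch of it as we understand it: each machine fits a local model, and the combiner then learns one weight per local model on a small data sample instead of averaging uniformly. It is not the paper's algorithm. It omits ACOWA's extra communication round and the sparsity-inducing penalty, and it assumes plain gradient descent with L2 regularization. The names `fit_logreg`, `combine_local_models`, and `log_loss` are hypothetical.

```python
import numpy as np

def fit_logreg(X, y, l2=1e-3, steps=300):
    """Gradient descent for L2-regularized logistic regression, labels in {0, 1}.

    Assumption: a stand-in solver; the paper targets sparse (L1-style) models.
    """
    n = X.shape[0]
    # Step size 1/L, where L bounds the curvature of the regularized objective.
    lr = 1.0 / (0.25 * np.linalg.norm(X, 2) ** 2 / n + l2)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 0.5 * (1.0 + np.tanh(0.5 * (X @ w)))  # numerically stable sigmoid
        w -= lr * (X.T @ (p - y) / n + l2 * w)
    return w

def combine_local_models(shards, X_small, y_small):
    """Return (naive average, learned weighted average) of per-shard models."""
    # Single communication round: every machine sends its local parameters.
    W = np.column_stack([fit_logreg(X, y) for X, y in shards])  # shape (d, m)
    naive = W.mean(axis=1)
    # Project the small sample onto the m local models, then fit m weights.
    v = fit_logreg(X_small @ W, y_small)
    return naive, W @ v

def log_loss(w, X, y):
    """Mean logistic loss of parameters w on (X, y)."""
    z = X @ w
    return np.mean(np.logaddexp(0.0, z) - y * z)
```

Calling `combine_local_models(shards, X_small, y_small)` with a list of `(X, y)` pairs and a small held-out sample returns both combinations, which can then be compared with `log_loss`. The second fit is cheap because it has only as many parameters as there are machines.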