Unwrapping ADMM: Efficient Distributed Computing via Transpose Reduction
This paper addresses slow distributed computing for large-scale model fitting and offers a more efficient alternative to consensus ADMM methods.
The paper tackles inefficient distributed model fitting by proposing iterative methods that use transpose reduction to solve global sub-problems over an entire distributed dataset, avoiding the expensive inner loops that consensus methods require. The approach fits linear classifiers and sparse linear models to datasets over 5 Tb in size using over 7000 cores, in far less time than previous methods.
Recent approaches to distributed model fitting rely heavily on consensus ADMM, where each node solves small sub-problems using only local data. We propose iterative methods that solve {\em global} sub-problems over an entire distributed dataset. This is possible using transpose reduction strategies that allow a single node to solve least-squares over massive datasets without putting all the data in one place. This results in simple iterative methods that avoid the expensive inner loops required for consensus methods. To demonstrate the efficiency of this approach, we fit linear classifiers and sparse linear models to datasets over 5 Tb in size using a distributed implementation with over 7000 cores in far less time than previous approaches.
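To make the transpose reduction idea concrete, the following is a minimal sketch for a plain least-squares problem, min_x ||Ax - b||^2, where the rows of A and b are split across nodes. It assumes the reduction takes the normal-equations form, in which each node sends only the small quantities A_i^T A_i and A_i^T b_i and a central node sums them and solves the resulting system. The function names, the list-of-shards layout, and the hand-written solver are illustrative assumptions, not the paper's implementation, and the surrounding ADMM iteration is not shown.

```python
# Sketch of transpose reduction for least squares: min_x ||A x - b||^2.
# Assumption: rows of A and b are split into shards (A_i, b_i), one per node.
# All names here are hypothetical and chosen for illustration only.

def local_gram(A_i, b_i):
    """On one node: compute (A_i^T A_i, A_i^T b_i) from local rows only."""
    n = len(A_i[0])
    G = [[0.0] * n for _ in range(n)]
    c = [0.0] * n
    for row, y in zip(A_i, b_i):
        for j in range(n):
            c[j] += row[j] * y
            for k in range(n):
                G[j][k] += row[j] * row[k]
    return G, c


def reduce_sum(parts):
    """Central node: add the n-by-n and length-n pieces from all nodes."""
    n = len(parts[0][1])
    G = [[0.0] * n for _ in range(n)]
    c = [0.0] * n
    for G_i, c_i in parts:
        for j in range(n):
            c[j] += c_i[j]
            for k in range(n):
                G[j][k] += G_i[j][k]
    return G, c


def solve(G, c):
    """Solve G x = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [G[j][:] + [c[j]] for j in range(n)]  # augmented matrix [G | c]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for j in reversed(range(n)):  # back substitution
        s = sum(M[j][k] * x[k] for k in range(j + 1, n))
        x[j] = (M[j][n] - s) / M[j][j]
    return x


def distributed_least_squares(shards):
    """Combine per-node reductions, then solve one small global system."""
    parts = [local_gram(A_i, b_i) for A_i, b_i in shards]
    return solve(*reduce_sum(parts))
```

In this sketch the communication cost per node depends only on the number of features n, not on the number of local rows, which is why a single node can solve the global least-squares problem without gathering the data. The sketch assumes A^T A is nonsingular and small enough to fit on one node.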