Bandwidth Optimal Pipeline Schedule for Collective Communication
This provides a universal solution for improving communication efficiency in distributed computing systems, though it builds heavily on existing graph theory work.
The paper tackles the problem of optimizing bandwidth for collective communication operations like allgather and reduce-scatter on any network topology, presenting a strongly polynomial-time algorithm that achieves provably the best possible bandwidth performance.
We present a strongly polynomial-time algorithm to generate bandwidth optimal allgather/reduce-scatter on any network topology, with or without switches. Our algorithm constructs pipeline schedules achieving provably the best possible bandwidth performance on a given topology. To provide a universal solution, we model the network topology as a directed graph with heterogeneous link capacities and switches directly as vertices in the graph representation. The algorithm is strongly polynomial-time with respect to the topology size. This work heavily relies on previous graph theory work on edge-disjoint spanning trees and edge splitting. While we focus on allgather, the methods in this paper can be easily extended to generate schedules for reduce, broadcast, reduce-scatter, and allreduce.