Anton Juerss

2 Papers

28.8 DC May 26
Revisiting Bruck: Phase-Efficient All-to-All Communication in Reconfigurable Networks

Anton Juerss, Stefan Schmid

All-to-All communication is a key performance bottleneck for distributed machine learning (ML) and high-performance computing (HPC) workloads, where dense traffic increasingly stresses scale-up interconnects. While these ML and HPC workloads have driven unprecedented infrastructure demand, optical reconfigurable networks (ORNs) offer a promising path forward. By adapting the physical topology to the active workload, they improve communication cost and bandwidth utilization. However, their benefit is critically contingent on whether the collective consists of structured phases that can be served by sparse and reusable topology states. In this paper, we revisit Bruck's All-to-All implementation and demonstrate the benefits of topology optimization in which both communication pattern and reconfiguration strategy are co-designed. We present ReTri, a bidirectional All-to-All schedule for ORNs. ReTri uses balanced ternary block propagation to complete All-to-All in $\lceil \log_3 n\rceil$ phases. The reconfiguration strategy induced by ReTri's pairwise bidirectional exchanges allows reconfiguration delays to be amortized across multiple phases. Preliminary simulations show that ReTri improves completion time by up to $10\times$ over static All-to-All, even for millisecond-scale reconfiguration delays, and by up to $2.1\times$ over reconfigurable Bruck.
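To make the $\lceil \log_3 n\rceil$ phase count concrete, the following is a minimal sketch of one way balanced ternary propagation could work. It is an illustrative assumption, not ReTri's actual schedule: each block's offset from source to destination is written with digits in {-1, 0, +1}, and in phase k the block moves forward by 3^k, backward by 3^k, or stays put. All function names are hypothetical.

```python
def ceil_log(n, base):
    """Smallest m with base**m >= n, using integer arithmetic only."""
    m, reach = 0, 1
    while reach < n:
        reach *= base
        m += 1
    return m


def balanced_ternary_digits(value, num_digits):
    """Digits in {-1, 0, 1}, least significant first, for a signed integer."""
    digits = []
    for _ in range(num_digits):
        r = value % 3  # Python's % yields 0, 1, or 2, also for negatives
        if r == 2:
            r = -1
        digits.append(r)
        value = (value - r) // 3
    assert value == 0, "value does not fit in num_digits balanced ternary digits"
    return digits


def simulate_ternary_all_to_all(n):
    """Route every (src, dst) block; return (phase count, all delivered?)."""
    phases = ceil_log(n, 3)
    half = (3 ** phases - 1) // 2  # largest magnitude representable
    blocks = []
    for src in range(n):
        for dst in range(n):
            offset = (dst - src) % n
            if offset > half:
                offset -= n  # use the negative representative instead
            blocks.append([src, dst, balanced_ternary_digits(offset, phases)])
    for k in range(phases):
        step = 3 ** k
        for block in blocks:
            # Digit +1: forward by 3^k; digit -1: backward by 3^k; 0: stay.
            block[0] = (block[0] + block[2][k] * step) % n
    return phases, all(node == dst for node, dst, _ in blocks)


if __name__ == "__main__":
    for n in (9, 27, 64):
        phases, ok = simulate_ternary_all_to_all(n)
        print(n, "nodes:", phases, "ternary phases vs", ceil_log(n, 2),
              "binary phases, delivered:", ok)
```

In this toy model each phase uses both directions of a pairwise link (offsets +3^k and -3^k), which is why a base-3 digit can be consumed per phase rather than a single bit.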

12.9 NI May 12
Bridge: Optimizing Collective Communication Schedules in Reconfigurable Networks with Reusable Subrings

Anton Juerss, Stefan Schmid

Optical circuit-switched networks have emerged as an appealing alternative to electrical fabrics as they can reconfigure the network topology at runtime, reducing communication cost and improving bandwidth utilization. Yet exploiting optical reconfigurable networks for collective communication comes with a fundamental trade-off: each reconfiguration incurs non-negligible delay, communication must pause while the fabric reconfigures, and the benefit of a new topology depends on future traffic. The central question is therefore when reconfiguration is worth its cost. While prior work has demonstrated the benefits of reconfiguration, existing strategies use optical links only to optimize the current step, without reusing them for future steps. In this paper, we present Bridge, a reconfiguration strategy for important collective communication primitives used in AI/ML and HPC applications, namely All-to-All, AllReduce, Reduce-Scatter, and AllGather. Bridge exploits the structure of Bruck's communication pattern to support efficient sparse reconfiguration. The key idea is to reduce propagation and transmission delay by directly connecting immediate communication partners while preserving efficient reachability to future peers through connected subrings. As a result, optical links can be reused across multiple subsequent steps, allowing the benefit of reconfiguration to amortize beyond a single step. Our evaluation shows that Bridge typically reduces All-to-All completion time by $3\times$ to $10\times$ over static baselines, even with millisecond-scale reconfiguration delays. For AllReduce, Bridge uniformly outperforms existing reconfiguration strategies, delivers up to $1.5\times$ speedup, and exceeds the bandwidth-optimal Ring algorithm by $1.5\times$ to $6.6\times$ on small to moderate-sized workloads.
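For reference, the structure of Bruck's communication pattern mentioned above can be sketched as follows. This is the textbook Bruck All-to-All routing rule only, with hypothetical function names; it does not model Bridge's subring construction or any reconfiguration cost. In step k, every node i sends to node (i + 2^k) mod n exactly those blocks whose source-to-destination offset has bit k set, so the partner offsets are known ahead of time for all steps.

```python
def bruck_phase_offsets(n):
    """Partner offsets 1, 2, 4, ... used by Bruck's All-to-All on n nodes."""
    offsets = []
    step = 1
    while step < n:
        offsets.append(step)
        step *= 2
    return offsets


def simulate_bruck_all_to_all(n):
    """Route every (src, dst) block; return (step count, all delivered?)."""
    offsets = bruck_phase_offsets(n)
    # position[(src, dst)] is the node currently holding that block.
    position = {(src, dst): src for src in range(n) for dst in range(n)}
    for k, step in enumerate(offsets):
        for src, dst in list(position):
            offset = (dst - src) % n
            if (offset >> k) & 1:
                # Bit k of the offset is set: forward the block by 2^k.
                position[(src, dst)] = (position[(src, dst)] + step) % n
    delivered = all(node == dst for (_, dst), node in position.items())
    return len(offsets), delivered


if __name__ == "__main__":
    for n in (8, 10, 64):
        steps, ok = simulate_bruck_all_to_all(n)
        print(n, "nodes:", steps, "steps, offsets", bruck_phase_offsets(n),
              "delivered:", ok)
```

Because the per-step offsets depend only on n and not on the data, a reconfiguration strategy can in principle plan which links to keep for later steps before the collective starts.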