NI · LG · Apr 26

Multi-Plane HyperX: A Low-Latency and Cost-Effective Network for Large-Scale AI and HPC Systems

arXiv:2604.23519 · 32.7

Predicted impact: top 39% in NI · last 90 days

Originality: Incremental advance

AI Analysis

For system architects designing low-latency, cost-effective networks for large-scale AI and HPC clusters, this paper proposes a topology that applies multi-plane techniques, so far used mainly in Fat-Tree networks, to the HyperX direct network.

This work introduces multi-plane HyperX, a network topology for large-scale AI and HPC systems, and shows it achieves smaller network diameter and better cost-effectiveness compared to multi-plane Fat-Tree, Dragonfly, and Dragonfly+.

Multi-plane architectures have become increasingly prevalent in the Fat-Tree networks of AI data centers. By leveraging multiple ports on a single network interface card (NIC) or multiple NICs within a scale-up domain, each port or NIC is allocated to an independent network plane, thereby provisioning the overall system with multiple network planes. However, no prior literature has explored the application of multi-plane technologies to direct networks such as HyperX. This paper investigates the multi-plane HyperX network and demonstrates that, compared to state-of-the-art network topologies like multi-plane Fat-Tree, Dragonfly, and Dragonfly+, the multi-plane HyperX architecture achieves a significantly smaller network diameter and superior cost-effectiveness.
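The two ideas in the abstract, splitting NIC ports across independent planes and the small diameter of HyperX, can be illustrated with a minimal sketch. It is not taken from the paper. It assumes a regular HyperX in which switches are coordinate tuples and two switches are linked when they differ in exactly one coordinate, and it assumes a simple round-robin port-to-plane mapping. The names `hyperx_neighbors`, `hyperx_diameter`, and `assign_ports_to_planes` are hypothetical.

```python
from collections import deque


def hyperx_neighbors(switch, shape):
    """Yield every switch that differs from `switch` in exactly one coordinate."""
    for dim, size in enumerate(shape):
        for value in range(size):
            if value != switch[dim]:
                yield switch[:dim] + (value,) + switch[dim + 1:]


def hyperx_diameter(shape):
    """Switch-to-switch diameter in hops, found by breadth-first search.

    A regular HyperX looks the same from every switch, so a single
    search from the all-zero switch is enough.
    """
    source = tuple(0 for _ in shape)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in hyperx_neighbors(current, shape):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return max(dist.values())


def assign_ports_to_planes(num_ports, num_planes):
    """Map each NIC port index to a plane index (round-robin, an assumption)."""
    return {port: port % num_planes for port in range(num_ports)}


if __name__ == "__main__":
    # A 2-D HyperX with 4 switches per dimension: any switch is at most 2 hops away.
    print(hyperx_diameter((4, 4)))          # 2
    # A 3-D HyperX with 3 switches per dimension.
    print(hyperx_diameter((3, 3, 3)))       # 3
    # Four NIC ports spread over four independent planes.
    print(assign_ports_to_planes(4, 4))     # {0: 0, 1: 1, 2: 2, 3: 3}
```

In this model the diameter equals the number of dimensions with more than one switch, regardless of how many switches each dimension holds. Each plane would be a separate copy of such a network, reached through its own NIC port.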
