OCDCMay 8

D-PDLP: Scaling PDLP to Distributed Multi-GPU Systems

arXiv:2601.076287.0h-index: 9
Predicted impact top 31% in OC · last 90 daysOriginality Incremental advance
AI Analysis

This work addresses the computational throughput bottleneck of single-GPU PDHG solvers for industrial-scale LP problems, enabling faster solution of massive linear programs.

D-PDLP extends the PDHG algorithm to distributed multi-GPU systems for solving massive-scale linear programming problems, achieving strong scalability and high performance on benchmarks including MIPLIB and Mittelmann instances while preserving full FP64 numerical accuracy.

We present a distributed framework of the Primal-Dual Hybrid Gradient (PDHG) algorithm for solving massive-scale linear programming (LP) problems. Although PDHG-based solvers demonstrate strong performance on single-node GPU architectures, their applicability to industrial-scale instances is often limited by single-GPU computational throughput. To overcome these challenges, we propose D-PDLP, the first Distributed PDLP framework, which extends PDHG to a multi-GPU setting via a practical two-dimensional grid partitioning of the constraint matrix. To improve load balance and computational efficiency, we introduce a block-wise random permutation strategy combined with nonzero-aware matrix partitioning. By distributing the intensive computation required in PDHG iterations, the proposed framework harnesses multi-GPU parallelism to achieve substantial speedups with relatively low communication overhead. Extensive experiments on standard LP benchmarks (including MIPLIB and Mittelmann instances) as well as huge-scale real-world datasets show that our distributed implementation, built upon cuPDLPx, achieves strong scalability and high performance while preserving full FP64 numerical accuracy.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes