Mahdi Barhoush

1 Paper

LG · Jul 22, 2024
Parallel Split Learning with Global Sampling

Mohammad Kohankhaki, Ahmad Ayad, Mahdi Barhoush et al.

Parallel split learning (PSL) suffers from two intertwined issues: the effective batch size grows with the number of clients, and data that are not independent and identically distributed (non-IID) skew the global batches. We present parallel split learning with global sampling (GPSL), a server-driven scheme that fixes the global batch size and computes per-client batch-size schedules from pooled-level proportions. Each selected client then draws its actual samples locally, without replacement. This eliminates per-class rounding, decouples the effective batch size from the client count, and makes each global batch distributionally equivalent to centralized uniform sampling without replacement. Consequently, we obtain finite-population deviation guarantees via Serfling's inequality, with zero rounding bias compared to local sampling schemes. GPSL is a drop-in replacement for PSL with negligible overhead and scales to large client populations. In extensive experiments on CIFAR-10/100 with ResNet-18/34 under non-IID splits, GPSL stabilizes optimization and achieves centralized-like accuracy, while fixed local batching trails by up to 60%. Furthermore, GPSL shortens training time by avoiding the inflation of training steps induced by data depletion. These findings suggest that GPSL is a promising and scalable approach for learning in resource-constrained environments.
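
The server-side scheduling idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `global_batch_schedule` and its parameters are hypothetical, and it assumes that per-client batch sizes are drawn from a multivariate hypergeometric distribution over the clients' remaining sample counts. Under that assumption, each global batch has a fixed size, and the per-client counts are distributed as if the batch had been drawn uniformly without replacement from the pooled data.

```python
import numpy as np


def global_batch_schedule(client_sizes, global_batch_size, seed=0):
    """Hypothetical helper: per-step, per-client batch sizes for one epoch.

    Assumption (not taken from the paper's text): counts are drawn from a
    multivariate hypergeometric distribution over the remaining samples,
    which mimics uniform sampling without replacement from the pooled data.
    """
    rng = np.random.default_rng(seed)
    remaining = np.asarray(client_sizes, dtype=np.int64).copy()
    schedule = []
    while remaining.sum() > 0:
        # The global batch size is fixed; only the last batch may be smaller.
        n = min(global_batch_size, int(remaining.sum()))
        # How many of the n pooled samples fall on each client.
        counts = rng.multivariate_hypergeometric(remaining, n)
        remaining -= counts
        schedule.append(counts)
    # Shape: (number of training steps, number of clients).
    return np.array(schedule)


if __name__ == "__main__":
    schedule = global_batch_schedule([500, 120, 30], global_batch_size=64)
    print(schedule.shape)        # steps x clients
    print(schedule.sum(axis=1))  # global batch size per step
    print(schedule.sum(axis=0))  # samples consumed per client
```

In this sketch the server only needs each client's dataset size. Each client would then draw the scheduled number of samples locally, without replacement, from its not-yet-used data, so the number of training steps depends on the pooled dataset size rather than on the client count.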