DC, GR | May 28

RAFI -- A Ray/Work Forwarding Infrastructure for Data Parallel Multi-Node/Multi-GPU Computing

arXiv:2605.30294 | 57.0
AI Analysis

For developers of GPU-accelerated data-parallel applications, RaFI reduces the effort to implement work migration across GPUs.

RaFI simplifies building GPU-enabled data-parallel software in which work items migrate between GPUs, and it handles the underlying CUDA and MPI work itself. The authors show its potential in several example applications.

We present RaFI, a CUDA and MPI based software framework that simplifies the task of building GPU-enabled data-parallel software where rays or similar work items need to migrate between different GPUs. RaFI provides a simple interface for CUDA kernels to forward such work items to other GPUs, while under the hood managing all the CUDA and MPI related work required to make this happen. We describe RaFI's motivation and implementation, and show its potential in several example applications.
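To make the forwarding pattern in the abstract concrete, here is a minimal single-process C++ sketch of the general idea: each rank processes its incoming work items, places the ones that must continue elsewhere into per-destination buckets, and an exchange step delivers those buckets. This is not RaFI's API. The names `Ray`, `Rank`, `makeRanks`, `processRank`, `exchange`, and `runUntilDone` are hypothetical, the "forward to the next rank" routing rule is made up for illustration, and plain loops stand in for the CUDA kernels and MPI communication that the real framework manages.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical work item; not RaFI's actual ray type.
struct Ray {
    int id;       // identifies the ray
    int bounces;  // remaining hops before the ray is retired
};

// Hypothetical per-rank state: work received for this round, plus
// outgoing work bucketed by destination rank.
struct Rank {
    std::vector<Ray> in;
    std::vector<std::vector<Ray>> out;  // out[d] holds rays destined for rank d
};

// Create n ranks, each with one (empty) outgoing bucket per destination.
std::vector<Rank> makeRanks(int n) {
    std::vector<Rank> ranks(n);
    for (Rank& r : ranks) r.out.resize(n);
    return ranks;
}

// Stand-in for a kernel launch on one rank: each incoming ray is either
// retired or forwarded. The routing rule (always the next rank) is an
// assumption for illustration. Returns the number of rays retired.
std::size_t processRank(int self, Rank& r, int numRanks) {
    std::size_t retired = 0;
    for (Ray ray : r.in) {
        if (ray.bounces == 0) {
            ++retired;
            continue;
        }
        --ray.bounces;
        const int dest = (self + 1) % numRanks;
        r.out[dest].push_back(ray);
    }
    r.in.clear();
    return retired;
}

// Stand-in for the communication step (the role an MPI all-to-all style
// exchange would play): move every bucket to its destination's input queue.
// Returns the number of rays delivered.
std::size_t exchange(std::vector<Rank>& ranks) {
    std::size_t moved = 0;
    for (Rank& src : ranks) {
        for (std::size_t d = 0; d < ranks.size(); ++d) {
            std::vector<Ray>& bucket = src.out[d];
            moved += bucket.size();
            ranks[d].in.insert(ranks[d].in.end(), bucket.begin(), bucket.end());
            bucket.clear();
        }
    }
    return moved;
}

// Alternate processing and exchange rounds until no ray is in flight.
// Returns the total number of rays retired across all ranks.
std::size_t runUntilDone(std::vector<Rank>& ranks) {
    const int n = static_cast<int>(ranks.size());
    std::size_t retired = 0;
    std::size_t inFlight = 0;
    do {
        for (int i = 0; i < n; ++i) retired += processRank(i, ranks[i], n);
        inFlight = exchange(ranks);
    } while (inFlight > 0);
    return retired;
}
```

A caller would seed some ranks' `in` queues and call `runUntilDone`; the loop ends once a round forwards nothing. In a real multi-GPU setting the termination check itself needs a collective agreement across ranks, which this single-process sketch sidesteps.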

