One weird trick for parallelizing convolutional neural networks
arXiv:1404.5997v21403 citations
AI Analysis
This addresses the challenge of efficient GPU utilization for deep learning practitioners, though it appears incremental as it builds on existing parallelization methods.
The paper tackles the problem of parallelizing convolutional neural network training across multiple GPUs, achieving significantly better scaling than existing alternatives.
I present a new way to parallelize the training of convolutional neural networks across multiple GPUs. The method scales significantly better than all alternatives when applied to modern convolutional neural networks.