Pushing the boundaries of parallel Deep Learning -- A practical approach
This is an incremental contribution for researchers and practitioners in parallel deep learning, focusing on practical implementation rather than theoretical breakthroughs.
This work assesses state-of-the-art data-parallel deep neural network training to identify opportunities for performance improvement, and presents a practical C++ library design that unifies current methodologies in a performance-conscious framework.
This work aims to assess the state of the art in data-parallel deep neural network training and to identify potential research directions for improving performance. In addition, it presents a design for a practical C++ library dedicated to implementing and unifying current state-of-the-art methodologies for parallel training in a performance-conscious framework, allowing users to explore novel strategies without departing significantly from their usual workflow.