DC, LG · May 12, 2020

Benchmark Tests of Convolutional Neural Network and Graph Convolutional Network on HorovodRunner Enabled Spark Clusters

arXiv:2005.05510v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides incremental benchmarking that helps smaller companies use accessible Spark clusters to compete with big tech in distributed deep learning.

The paper tackled the lack of benchmark tests for HorovodRunner on Spark clusters, showing it significantly improves scaling efficiency for CNN tasks on GPU and CPU clusters, but not for GCN tasks, and implemented the Rectified Adam optimizer in HorovodRunner for the first time.

The freedom to iterate quickly on distributed deep learning tasks is crucial for smaller companies to gain competitive advantage and market share from big tech giants. HorovodRunner brings this process to relatively accessible Spark clusters. However, there have been no benchmark tests of HorovodRunner itself, none specifically of graph convolutional networks (GCN, hereafter), and only very limited scalability benchmarks of Horovod, its predecessor, which requires custom-built GPU clusters. For the first time, we show that Databricks' HorovodRunner achieves a significant lift in scaling efficiency for convolutional neural network (CNN, hereafter) based tasks on both GPU and CPU clusters, but not for the original GCN task. We also implemented the Rectified Adam optimizer in HorovodRunner for the first time.
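
The abstract's central metric is scaling efficiency. As a minimal sketch, assuming the common definition (measured throughput with N workers divided by N times the single-worker throughput), it can be computed as below. The function name and the throughput figures are hypothetical illustrations, not values or code from the paper.

```python
def scaling_efficiency(throughput_n, throughput_1, n_workers):
    """Fraction of ideal linear speedup achieved with n_workers.

    Assumed definition: throughput_n / (n_workers * throughput_1).
    A value of 1.0 means perfectly linear scaling.
    """
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    if throughput_1 <= 0:
        raise ValueError("throughput_1 must be positive")
    return throughput_n / (n_workers * throughput_1)


# Hypothetical throughputs in images/sec, keyed by worker count.
# These numbers are made up for illustration only.
single_worker = 100.0
measured = {1: 100.0, 2: 190.0, 4: 360.0, 8: 640.0}

efficiencies = {
    n: scaling_efficiency(t, single_worker, n) for n, t in measured.items()
}

for n, eff in efficiencies.items():
    print(f"{n} workers: {eff:.1%}")
```

Under this definition, a task that scales poorly shows efficiency falling quickly as workers are added, which is the kind of contrast the abstract draws between the CNN and GCN tasks.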

Code Implementations: 1 repo