LG · ML · Sep 28, 2019

GDP: Generalized Device Placement for Dataflow Graphs

arXiv:1910.01578v1 · 43 citations
Originality: Incremental advance
AI Analysis

This paper addresses automated device placement for large neural networks, a problem critical to performance in heterogeneous computing environments, though it builds incrementally on existing methods.

The paper tackles the problem of efficiently placing operations in neural network dataflow graphs on devices to improve runtime and scalability, achieving on average a 16% improvement over human experts and 9.2% over prior art with 15 times faster convergence.

Runtime and scalability of large neural networks can be significantly affected by the placement of operations in their dataflow graphs on suitable devices. With increasingly complex neural network architectures and heterogeneous device characteristics, finding a reasonable placement is extremely challenging even for domain experts. Most existing automated device placement approaches are impractical due to the significant amount of compute required and their inability to generalize to new, previously held-out graphs. To address both limitations, we propose an efficient end-to-end method based on a scalable sequential attention mechanism over a graph neural network that is transferable to new graphs. On a diverse set of representative deep learning models, including Inception-v3, AmoebaNet, Transformer-XL, and WaveNet, our method on average achieves 16% improvement over human experts and 9.2% improvement over the prior art with 15 times faster convergence. To further reduce the computation cost, we pre-train the policy network on a set of dataflow graphs and use a superposition network to fine-tune it on each individual graph, achieving state-of-the-art performance on large hold-out graphs with over 50k nodes, such as an 8-layer GNMT.
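To make the placement problem concrete, the following is a minimal sketch of how one might score a candidate placement. It is not the paper's method (which learns a policy with a graph neural network and sequential attention); it is a toy cost model under stated assumptions: each op has a fixed compute time, each device runs one op at a time, and every edge that crosses devices adds a fixed transfer delay. The function name `simulate_runtime`, the `comm_cost` parameter, and the four-op example graph are all hypothetical.

```python
from collections import defaultdict


def simulate_runtime(ops, edges, placement, comm_cost=1.0):
    """Estimate the makespan of a dataflow graph under a given placement.

    ops:       dict mapping op name -> compute time, inserted in topological order
    edges:     list of (producer, consumer) pairs
    placement: dict mapping op name -> device id
    comm_cost: assumed fixed delay for any edge that crosses devices
    """
    preds = defaultdict(list)
    for src, dst in edges:
        preds[dst].append(src)

    device_free = defaultdict(float)  # time at which each device becomes idle
    finish = {}
    for op, cost in ops.items():  # relies on topological insertion order
        dev = placement[op]
        ready = 0.0
        for p in preds[op]:
            # Inputs produced on another device arrive after a transfer delay.
            delay = comm_cost if placement[p] != dev else 0.0
            ready = max(ready, finish[p] + delay)
        start = max(ready, device_free[dev])
        finish[op] = start + cost
        device_free[dev] = finish[op]
    return max(finish.values())


# Hypothetical diamond-shaped graph: a feeds b and c, which both feed d.
ops = {"a": 1.0, "b": 4.0, "c": 4.0, "d": 1.0}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]

single = {"a": 0, "b": 0, "c": 0, "d": 0}  # everything on device 0
split = {"a": 0, "b": 0, "c": 1, "d": 0}   # branch c moved to device 1

t_single = simulate_runtime(ops, edges, single)              # 10.0
t_split = simulate_runtime(ops, edges, split)                # 8.0
t_split_slow = simulate_runtime(ops, edges, split, 5.0)      # 16.0
print(t_single, t_split, t_split_slow)
```

In this toy setting, splitting the two branches across devices helps when transfers are cheap and hurts when they are expensive, which is the kind of trade-off a placement method has to navigate on graphs with tens of thousands of nodes.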

