LG · CL · DC · Dec 11, 2017

Cavs: A Vertex-centric Programming Interface for Dynamic Neural Networks

arXiv:1712.04048v1 · 1 citation
Originality: Incremental advance
AI Analysis

The paper addresses a performance bottleneck for researchers and practitioners who train dynamic neural networks. It offers a more efficient system, though the advance over existing methods is incremental.

The paper tackles the inefficiency of existing frameworks in expressing and executing dynamic neural networks with variable structures, such as graphs and sequences, by introducing Cavs, a vertex-centric programming interface that achieves nearly an order of magnitude speedup in training compared to state-of-the-art frameworks like TensorFlow Fold and DyNet.

Recent deep learning (DL) models have moved beyond static network architectures to dynamic ones, handling data where the network structure changes with every example, such as sequences of variable lengths, trees, and graphs. Existing dataflow-based programming models for DL, under both static and dynamic declaration, either cannot readily express these dynamic models or are inefficient because of repeated dataflow graph construction and processing and difficulties in batched execution. We present Cavs, a vertex-centric programming interface and optimized system implementation for dynamic DL models. Cavs represents dynamic network structure as a static vertex function $\mathcal{F}$ and a dynamic instance-specific graph $\mathcal{G}$, and performs backpropagation by scheduling the execution of $\mathcal{F}$ following the dependencies in $\mathcal{G}$. Cavs bypasses expensive graph construction and preprocessing overhead, allows the use of static graph optimization techniques on the pre-defined operations in $\mathcal{F}$, and naturally exposes batched execution opportunities across different graphs. Experiments comparing Cavs to two state-of-the-art frameworks for dynamic NNs (TensorFlow Fold and DyNet) demonstrate the efficacy of this approach: Cavs achieves nearly an order of magnitude speedup in training various dynamic NN architectures, and ablations demonstrate the contribution of our proposed batching and memory management strategies.
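The split between a static vertex function $\mathcal{F}$ and a per-example graph $\mathcal{G}$ can be illustrated with a small Python sketch. This is not the Cavs API. The names `vertex_fn`, `schedule_levels`, and `forward_batched` are hypothetical, the toy rule inside `vertex_fn` (own input plus the sum of the children's states) is an assumption chosen for brevity, and only the forward pass is shown. The sketch groups vertices by depth so that vertices from different input graphs that are ready at the same step fall into one batch.

```python
from collections import defaultdict


def vertex_fn(x, child_states):
    """Static vertex function F: the same computation at every vertex.

    Toy rule (an assumption, not taken from the paper):
    the vertex's own input plus the sum of its children's states.
    """
    return x + sum(child_states)


def schedule_levels(children):
    """Group the vertices of one input graph G into dependency levels.

    `children` maps each vertex id to the list of its child ids.
    A vertex is placed one level above its deepest child, so every
    vertex is evaluated only after all of its children.
    """
    depth = {}

    def visit(v):
        if v not in depth:
            depth[v] = 1 + max((visit(c) for c in children[v]), default=-1)
        return depth[v]

    for v in children:
        visit(v)

    levels = defaultdict(list)
    for v, d in depth.items():
        levels[d].append(v)
    return [levels[d] for d in sorted(levels)]


def forward_batched(graphs, inputs):
    """Evaluate F over several graphs, batching same-level vertices across graphs.

    Returns the per-graph vertex states and the size of each batch.
    """
    states = [dict() for _ in graphs]
    per_graph_levels = [schedule_levels(g) for g in graphs]
    n_levels = max(len(lv) for lv in per_graph_levels)
    batch_sizes = []
    for d in range(n_levels):
        # One batch: every vertex at depth d, collected from all graphs.
        batch = [
            (i, v)
            for i, lv in enumerate(per_graph_levels)
            if d < len(lv)
            for v in lv[d]
        ]
        batch_sizes.append(len(batch))
        # A real system would run this loop as one batched kernel call.
        for i, v in batch:
            kids = [states[i][c] for c in graphs[i][v]]
            states[i][v] = vertex_fn(inputs[i][v], kids)
    return states, batch_sizes


# Two differently shaped inputs: a small tree and a chain.
tree_a = {0: [1, 2], 1: [], 2: []}
chain_b = {0: [1], 1: [2], 2: []}
inputs_a = {0: 1.0, 1: 2.0, 2: 3.0}
inputs_b = {0: 1.0, 1: 1.0, 2: 1.0}
states, batch_sizes = forward_batched([tree_a, chain_b], [inputs_a, inputs_b])
```

In this sketch `vertex_fn` is defined once and never rebuilt, while only the cheap `children` dictionaries differ per example. That mirrors the abstract's point that the per-example cost is in scheduling $\mathcal{F}$ over $\mathcal{G}$ rather than in constructing a new dataflow graph.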

