DC · LG · MS · Mar 8, 2019

Auto-Vectorizing TensorFlow Graphs: Jacobians, Auto-Batching And Beyond

arXiv:1903.04243v1 · 16 citations
Originality: Incremental advance
AI Analysis

This work addresses performance bottlenecks that developers and researchers face in machine learning frameworks like TensorFlow. It appears incremental because it builds on the existing high-level dataflow IR.

The authors address inefficient loop-based operations in TensorFlow with a static loop vectorization optimization, and report large speedups over both loop-based implementations and DyNet's run-time batching.

We propose a static loop vectorization optimization on top of the high-level dataflow IR used by frameworks like TensorFlow. A new statically vectorized parallel-for abstraction is provided on top of TensorFlow and used for applications ranging from auto-batching and per-example gradients to Jacobian computation, optimized map functions, and input pipeline optimization. We report huge speedups compared to both loop-based implementations and the run-time batching adopted by the DyNet framework.
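To make the idea of a vectorized parallel-for concrete, here is a minimal pure-Python sketch. It is not the paper's implementation and does not use TensorFlow. The names `Batched`, `loop_map`, and `toy_pfor` are hypothetical, introduced only for illustration. The sketch assumes a loop body built from `+` and `*` whose iterations are independent. Instead of calling the body once per iteration, it calls the body once on a value that stands for all iterations.

```python
# Toy illustration of loop vectorization (an assumption-laden sketch,
# not the paper's algorithm and not a TensorFlow API).

class Batched:
    """Hypothetical helper: holds one value per loop iteration."""

    def __init__(self, values):
        self.values = list(values)

    def _lift(self, other):
        # Loop-invariant scalars are broadcast to every iteration.
        if isinstance(other, Batched):
            return other.values
        return [other] * len(self.values)

    def __add__(self, other):
        return Batched(a + b for a, b in zip(self.values, self._lift(other)))

    __radd__ = __add__

    def __mul__(self, other):
        return Batched(a * b for a, b in zip(self.values, self._lift(other)))

    __rmul__ = __mul__


def loop_map(body, xs):
    """Baseline: run the body once per iteration."""
    return [body(x) for x in xs]


def toy_pfor(body, xs):
    """Run the body once on a batched value covering all iterations."""
    return body(Batched(xs)).values


def body(x):
    # Per-iteration computation: 3*x^2 + 2*x + 1
    return 3 * x * x + 2 * x + 1


xs = [0, 1, 2, 3]
loop_result = loop_map(body, xs)
pfor_result = toy_pfor(body, xs)
print(loop_result == pfor_result)
```

In this toy version each batched operation is still a Python list comprehension, so it shows only the structure of the transformation, not any speedup. In a framework setting each such operation would presumably map to a single batched kernel, which is where the gains described in the abstract would come from.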

