LG | May 17, 2023

ACRoBat: Optimizing Auto-batching of Dynamic Deep Learning at Compile Time

arXiv:2305.10611v2 | 4 citations
Originality: Incremental advance
AI Analysis

This work addresses a throughput bottleneck in dynamic deep learning applications such as text parsing and machine translation, where control flow divergence makes batching hard to perform manually, and it reports a substantial speedup over an existing auto-batching framework.

The paper tackles the challenge of batching dynamic deep learning computations whose control flow diverges across inputs. It presents ACRoBat, a framework that performs up to 8.5X better than DyNet on an Nvidia GeForce GPU.

Dynamic control flow is an important technique often used to design expressive and efficient deep learning computations for applications such as text parsing, machine translation, exiting early out of deep models and so on. The control flow divergence resulting from dynamic control flow makes batching, an important optimization enabling high throughput and hardware utilization, difficult to perform manually. In this paper, we present ACRoBat, a framework that enables efficient automatic batching for dynamic deep learning computations by performing hybrid static+dynamic compiler optimizations and end-to-end tensor code generation. ACRoBat performs up to 8.5X better than DyNet, a state-of-the-art framework for automatic batching, on an Nvidia GeForce GPU.
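To make the batching problem in the abstract concrete, here is a minimal Python sketch. It is not ACRoBat's algorithm, which the abstract describes only as hybrid static+dynamic compiler optimizations. It is a hand-written toy that assumes a scalar recurrent cell and variable-length inputs, so the loop trip count diverges per input. The names `cell`, `batched_cell`, `run_unbatched`, `run_batched` and the weights `W` and `U` are all hypothetical. The sketch counts one "launch" per call to the cell to show how grouping the instances that are still active at each step reduces the number of calls.

```python
import math

W, U = 0.5, 0.25  # toy scalar weights (hypothetical, for illustration only)

def cell(h, x):
    """One recurrent step on a single instance."""
    return math.tanh(W * h + U * x)

def batched_cell(hs, xs):
    """The same step applied to a whole batch; stands in for one batched kernel."""
    return [math.tanh(W * h + U * x) for h, x in zip(hs, xs)]

def run_unbatched(sequences):
    """Process each sequence independently: one launch per token."""
    launches = 0
    finals = []
    for seq in sequences:
        h = 0.0
        for x in seq:
            h = cell(h, x)
            launches += 1
        finals.append(h)
    return finals, launches

def run_batched(sequences):
    """At each time step, batch every sequence that has not finished yet."""
    launches = 0
    hs = [0.0] * len(sequences)
    max_len = max((len(s) for s in sequences), default=0)
    for t in range(max_len):
        # Control flow divergence: shorter sequences drop out of the batch.
        active = [i for i, s in enumerate(sequences) if t < len(s)]
        new_hs = batched_cell([hs[i] for i in active],
                              [sequences[i][t] for i in active])
        launches += 1
        for i, h in zip(active, new_hs):
            hs[i] = h
    return hs, launches

sequences = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
unbatched, n_unbatched = run_unbatched(sequences)
batched, n_batched = run_batched(sequences)
print(n_unbatched, n_batched)
```

In this toy, both strategies compute the same final states, but the batched version issues one call per time step instead of one per token. Real workloads such as tree-structured parsers diverge in less regular ways than a simple length difference, which is why doing this grouping by hand is hard.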

