CV · Mar 31, 2022

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

arXiv:2203.16755v1 · 18 citations
Originality: Incremental advance
AI Analysis

The paper addresses memory efficiency for researchers and practitioners who train video models, offering a practical solution with minimal performance loss.

The paper tackles the high GPU memory usage of training deep neural networks on videos. It proposes Stochastic Backpropagation, which randomly drops backward paths to reduce activation caching, yielding up to 80% memory savings and a 10% training speedup with less than a 1% accuracy drop.

We propose a memory-efficient method, named Stochastic Backpropagation (SBP), for training deep neural networks on videos. It is based on the finding that gradients from an incomplete backpropagation pass can still train the models effectively with minimal accuracy loss, which is attributed to the high redundancy of video. SBP keeps all forward paths but randomly and independently removes the backward paths for each network layer in each training step. It reduces the GPU memory cost by eliminating the need to cache activation values corresponding to the dropped backward paths, and the amount dropped can be controlled by an adjustable keep-ratio. Experiments show that SBP can be applied to a wide range of models for video tasks, leading to up to 80.0% GPU memory saving and 10% training speedup with less than 1% accuracy drop on action recognition and temporal action detection.
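The idea in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation: the `FrameLinear` class, the `sample_keep_mask` helper, and the choice to drop backward paths at whole-frame granularity are all assumptions made for illustration. The sketch shows a single per-frame linear layer that runs the forward pass on every frame but caches inputs, and therefore accumulates weight gradients, only for a randomly kept subset.

```python
import random


def sample_keep_mask(num_frames, keep_ratio, rng):
    """Hypothetical helper: choose which frames keep their backward path."""
    num_keep = max(1, round(num_frames * keep_ratio))  # always keep at least one
    kept = set(rng.sample(range(num_frames), num_keep))
    return [t in kept for t in range(num_frames)]


class FrameLinear:
    """Illustrative per-frame linear layer y_t = W x_t (not from the paper)."""

    def __init__(self, weight):
        self.weight = weight  # list of rows, shape (out_dim, in_dim)
        self.cache = {}       # frame index -> input, stored for kept frames only

    def forward(self, frames, keep_mask):
        # The forward path is computed for every frame.
        outputs = [
            [sum(w * x for w, x in zip(row, frame)) for row in self.weight]
            for frame in frames
        ]
        # Only kept frames have their activations cached for the backward pass.
        self.cache = {
            t: frame
            for t, (frame, keep) in enumerate(zip(frames, keep_mask))
            if keep
        }
        return outputs

    def backward(self, grad_outputs):
        # The weight gradient is accumulated over the kept frames only.
        out_dim, in_dim = len(self.weight), len(self.weight[0])
        grad_w = [[0.0] * in_dim for _ in range(out_dim)]
        for t, x in self.cache.items():
            g = grad_outputs[t]
            for i in range(out_dim):
                for j in range(in_dim):
                    grad_w[i][j] += g[i] * x[j]
        return grad_w


rng = random.Random(0)
frames = [[float(t), 1.0] for t in range(8)]      # 8 toy "frames", 2 features each
layer = FrameLinear([[0.5, -1.0]])
mask = sample_keep_mask(len(frames), keep_ratio=0.25, rng=rng)
outputs = layer.forward(frames, mask)             # all 8 frames get outputs
grad_w = layer.backward([[1.0] for _ in frames])  # gradient from 2 cached frames
```

With a keep-ratio of 0.25, the layer stores 2 of the 8 inputs, so the activation cache in this toy layer is a quarter of its full size. In a multi-layer model, each layer would draw its own mask in each training step, matching the "randomly and independently" wording above.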

Code Implementations: 1 repo
