CV | Apr 16, 2018

Training convolutional neural networks with megapixel images

arXiv:1804.05712v1 | 14 citations | Has Code
Originality: Highly original
AI Analysis

This work addresses a bottleneck for computer vision researchers and practitioners who need to train models on high-resolution images but are constrained by GPU memory.

The authors address the memory limits of training convolutional neural networks on large images with a method that holds only parts of the image in memory at a time. The method gives results equivalent to conventional training and, as a proof of concept, trains on 64 megapixel images with 97% less memory than the conventional approach.

To train deep convolutional neural networks, the input data and the intermediate activations need to be kept in memory to calculate the gradient descent step. Given the limited memory available in the current generation accelerator cards, this limits the maximum dimensions of the input data. We demonstrate a method to train convolutional neural networks holding only parts of the image in memory while giving equivalent results. We quantitatively compare this new way of training convolutional neural networks with conventional training. In addition, as a proof of concept, we train a convolutional neural network with 64 megapixel images, which requires 97% less memory than the conventional approach.
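
The claim that a network can see only parts of the image and still give equivalent results can be illustrated for the forward pass of a single convolution layer. The sketch below is not the authors' implementation. It is a minimal pure-Python example, with hypothetical function names (`conv2d_valid`, `conv2d_tiled`) and an assumed tile size, showing that convolving overlapping input patches one at a time reproduces the full-image output. It does not cover the backward pass or multi-layer networks.

```python
import random


def conv2d_valid(image, kernel):
    """Unpadded, stride-1 2D cross-correlation over a list-of-lists image."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(kernel[i][j] * image[y + i][x + j]
                for i in range(kh) for j in range(kw))
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def conv2d_tiled(image, kernel, tile=8):
    """Same result as conv2d_valid, but only one input patch is processed at a time.

    `tile` is the side length of each output tile (an assumed value for this demo).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(0, out_h, tile):
        for c in range(0, out_w, tile):
            r_end = min(r + tile, out_h)
            c_end = min(c + tile, out_w)
            # An output tile needs an input patch that is (kernel size - 1)
            # larger per axis, so neighbouring patches overlap.
            patch = [row[c:c_end + kw - 1] for row in image[r:r_end + kh - 1]]
            result = conv2d_valid(patch, kernel)
            for dy, out_row in enumerate(result):
                out[r + dy][c:c_end] = out_row
    return out


random.seed(0)
image = [[random.random() for _ in range(32)] for _ in range(32)]
kernel = [[random.random() for _ in range(3)] for _ in range(3)]

full = conv2d_valid(image, kernel)
tiled = conv2d_tiled(image, kernel, tile=8)
print(full == tiled)
```

In this toy setting the tiled version only ever passes a patch of at most 10 × 10 values to the convolution, although the complete output is still assembled in memory.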

Code Implementations: 1 repo