Low-memory convolutional neural networks through incremental depth-first processing
This work addresses memory constraints on embedded systems, but the contribution is incremental: it modifies how existing CNNs are processed without fundamentally changing the architecture.
The paper tackles the problem of high memory usage in convolutional neural network (CNN) inference for embedded applications by introducing an incremental depth-first processing scheme, which yields a constant memory footprint for 1D input and a footprint proportional to the square root of the input dimension for 2D input.
We introduce an incremental processing scheme for convolutional neural network (CNN) inference, targeted at embedded applications with limited memory budgets. Instead of processing layers one by one, individual input pixels are propagated through all parts of the network they can influence under the given structural constraints. This depth-first updating scheme comes with hard bounds on the memory footprint: the memory required is constant in the case of 1D input and proportional to the square root of the input dimension in the case of 2D input.
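The 1D case can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes single-channel, stride-1, unpadded ("valid") convolutions followed by ReLU, and the names StreamingConv1D, depth_first, and layer_by_layer are hypothetical. Each layer keeps only the last k inputs it has seen, so the total buffer size is the sum of the kernel sizes and does not grow with the length of the input.

```python
from collections import deque

import numpy as np


class StreamingConv1D:
    """Hypothetical single-channel, stride-1, valid 1D conv + ReLU, fed one sample at a time."""

    def __init__(self, kernel):
        self.kernel = np.asarray(kernel, dtype=float)
        # Only the last len(kernel) inputs are ever stored.
        self.window = deque(maxlen=len(self.kernel))

    def push(self, x):
        """Consume one input sample; return one output sample, or None if more context is needed."""
        self.window.append(x)
        if len(self.window) < len(self.kernel):
            return None
        y = float(np.dot(self.kernel, np.array(self.window)))
        return max(y, 0.0)  # ReLU


def depth_first(stream, layers):
    """Propagate each input sample through every layer it can already influence."""
    for x in stream:
        for layer in layers:
            x = layer.push(x)
            if x is None:  # this layer cannot emit yet, so deeper layers get nothing
                break
        else:
            yield x  # the sample reached the end of the network


def layer_by_layer(signal, kernels):
    """Reference: conventional processing that materializes every full activation map."""
    a = np.asarray(signal, dtype=float)
    for k in kernels:
        # np.correlate in 'valid' mode is the same sliding dot product as push().
        a = np.maximum(np.correlate(a, np.asarray(k, dtype=float), mode="valid"), 0.0)
    return a
```

With kernels of sizes 3, 2, and 3, for example, `list(depth_first(signal, layers))` should agree with `layer_by_layer(signal, kernels)` up to floating-point rounding, while the streaming version never holds more than 3 + 2 + 3 = 8 samples regardless of how long `signal` is.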