DecomposeMe: Simplifying ConvNets for End-to-End Learning
DecomposeMe addresses the efficiency limits of ConvNets on embedded vision systems, offering a method that reduces the number of parameters while improving performance.
The paper tackles the computational cost of ConvNets on embedded devices by proposing DecomposeMe, a technique that simplifies networks by learning features with 1D convolutions. On Places2, it yields a 7.7% relative improvement in top-1 classification accuracy with an architecture that requires 92% fewer parameters than VGG-B.
Deep learning and convolutional neural networks (ConvNets) have been successfully applied to most relevant tasks in the computer vision community. However, these networks are computationally demanding and not suitable for embedded devices, where memory and time consumption are relevant. In this paper, we propose DecomposeMe, a simple but effective technique to learn features using 1D convolutions. The proposed architecture enables both simplicity and filter sharing, leading to increased learning capacity. A comprehensive set of large-scale experiments on ImageNet and Places2 demonstrates the ability of our method to improve performance while significantly reducing the number of parameters required. Notably, on Places2, we obtain a relative improvement in top-1 classification accuracy of 7.7% with an architecture that requires 92% fewer parameters than VGG-B. The proposed network is also shown to generalize to other tasks by converting existing networks.
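
To make the idea of learning features with 1D convolutions concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a k x k filter is replaced by a vertical k x 1 filter followed by a horizontal 1 x k filter, which the abstract does not spell out. The helper names (`conv2d_valid`, `full_params`, `decomposed_params`), the intermediate channel count `c_mid`, and the example channel sizes are all hypothetical. Nonlinearities and biases are omitted.

```python
def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with 'valid' padding on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def full_params(k, c_in, c_out):
    """Weights in a standard k x k layer (biases ignored)."""
    return k * k * c_in * c_out


def decomposed_params(k, c_in, c_mid, c_out):
    """Weights in a k x 1 layer (c_in -> c_mid) followed by a
    1 x k layer (c_mid -> c_out). c_mid is an assumed free choice."""
    return k * c_in * c_mid + k * c_mid * c_out


# A rank-1 (separable) 3 x 3 kernel built from two 1D filters.
v = [1, 2, 1]      # vertical 1D filter
h = [-1, 0, 1]     # horizontal 1D filter
kernel_2d = [[a * b for b in h] for a in v]

# Small deterministic integer test image (5 rows, 6 columns).
image = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]

# One 2D pass versus two 1D passes.
direct = conv2d_valid(image, kernel_2d)
vertical = conv2d_valid(image, [[a] for a in v])   # 3 x 1 kernel
two_step = conv2d_valid(vertical, [h])             # 1 x 3 kernel

print(direct == two_step)

# Weight counts for an illustrative layer (sizes chosen arbitrarily).
print(full_params(3, 256, 256), decomposed_params(3, 256, 256, 256))
```

The equality holds exactly only for rank-1 kernels. A network built from pairs of 1D layers therefore learns a different, more constrained filter family than a standard 2D layer. The parameter counts above illustrate the per-layer saving under these assumptions and are not meant to reproduce the 92% figure reported for the full architecture.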