How Not to Give a FLOP: Combining Regularization and Pruning for Efficient Inference
This paper addresses the deployment bottleneck for deep learning models in the tech industry, but its contribution is incremental, since it builds on existing regularization and pruning methods.
The paper tackles the problem of speeding up deep neural network inference by combining regularization and pruning techniques. The combination reduces floating-point operations (FLOPs) substantially more than either technique used on its own.
The challenge of speeding up deep learning models during the deployment phase has been a large, expensive bottleneck in the modern tech industry. In this paper, we examine the use of both regularization and pruning for reduced computational complexity and more efficient inference in Deep Neural Networks (DNNs). In particular, we apply mixup and cutout regularizations and soft filter pruning to the ResNet architecture, focusing on minimizing floating-point operations (FLOPs). Furthermore, we show that using regularization in conjunction with network pruning yields a substantial improvement over either technique used individually.
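To make the three ingredients named in the abstract concrete, the following is a minimal NumPy sketch of mixup, cutout, and one soft-filter-pruning step on a single convolutional weight tensor. It is an illustration under stated assumptions, not the authors' implementation: the function names, default hyperparameters (`alpha`, `size`, `prune_ratio`), and the helper `conv_macs` are hypothetical, and the cost estimate counts only multiply-accumulates for one layer's surviving filters.

```python
import numpy as np


def mixup(x, y_onehot, alpha=1.0, rng=None):
    """Blend each example with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)            # mixing coefficient in [0, 1]
    perm = rng.permutation(x.shape[0])      # partner index for each example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix


def cutout(x, size=8, rng=None):
    """Zero a square patch at a random location in each image (N, C, H, W)."""
    rng = np.random.default_rng() if rng is None else rng
    out = x.copy()
    n, _, h, w = x.shape
    for i in range(n):
        cy, cx = rng.integers(h), rng.integers(w)   # patch centre
        y0, y1 = max(cy - size // 2, 0), min(cy + size // 2, h)
        x0, x1 = max(cx - size // 2, 0), min(cx + size // 2, w)
        out[i, :, y0:y1, x0:x1] = 0.0               # patch is clipped at borders
    return out


def soft_filter_prune(weight, prune_ratio=0.3):
    """Zero the filters with the smallest L2 norm.

    weight has shape (out_channels, in_channels, kH, kW). The pruning is
    "soft" in the sense that the zeroed filters stay in the tensor and may be
    updated again by later training steps; this function performs one
    zeroing step only.
    """
    n_filters = weight.shape[0]
    n_prune = int(n_filters * prune_ratio)
    norms = np.sqrt((weight.reshape(n_filters, -1) ** 2).sum(axis=1))
    mask = np.ones(n_filters, dtype=bool)
    mask[np.argsort(norms)[:n_prune]] = False       # False = pruned filter
    pruned = weight * mask[:, None, None, None]
    return pruned, mask


def conv_macs(mask, in_channels, kernel_size, out_h, out_w):
    """Hypothetical cost model: multiply-accumulates of the surviving filters."""
    return int(mask.sum()) * in_channels * kernel_size ** 2 * out_h * out_w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.random((4, 3, 32, 32))
    labels = np.eye(10)[[0, 1, 2, 3]]

    mixed_x, mixed_y = mixup(images, labels, rng=rng)
    augmented = cutout(mixed_x, size=8, rng=rng)

    conv_weight = rng.standard_normal((16, 3, 3, 3))
    pruned_weight, keep = soft_filter_prune(conv_weight, prune_ratio=0.5)
    print("filters kept:", int(keep.sum()), "of", keep.size)
    print("layer MACs:", conv_macs(keep, 3, 3, 32, 32))
```

In a training loop, the two augmentations would be applied to each batch before the forward pass, and the pruning step would be repeated periodically (for example, after each epoch) so that the network keeps training with the zeroed filters free to recover. Under the simple cost model above, zeroing half of a layer's filters halves that layer's multiply-accumulate count; a fuller accounting would also reduce the input channels of the following layer.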