MBS: Macroblock Scaling for CNN Model Reduction
This work addresses the problem of deploying large CNN models in resource-constrained environments and offers an incremental improvement over existing model-reduction methods.
The paper tackles CNN model size reduction by proposing the Macroblock Scaling (MBS) algorithm, which adaptively shrinks each macroblock according to its information redundancy and achieves model-size reductions of up to 72.71% (on ResNet-1202) with negligible accuracy loss.
In this paper, we propose the macroblock scaling (MBS) algorithm, which can be applied to various CNN architectures to reduce their model size. MBS adaptively reduces each CNN macroblock depending on its information redundancy, as measured by our proposed effective flops. Empirical studies conducted with ImageNet and CIFAR-10 attest that MBS can reduce the model size of some already compact CNN models, e.g., MobileNetV2 (25.03% further reduction) and ShuffleNet (20.74%), and even ultra-deep ones such as ResNet-101 (51.67%) and ResNet-1202 (72.71%), with negligible accuracy degradation. MBS also achieves greater reduction at a much lower cost than state-of-the-art optimization-based methods. MBS's simplicity and efficiency, its flexibility to work with any CNN model, and its scalability to models of any depth make it an attractive choice for CNN model size reduction.
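To make the idea of per-macroblock scaling concrete, the following is a minimal sketch and not the paper's actual procedure. It assumes each macroblock already has a redundancy score in [0, 1]; how such a score would be derived from effective flops is not specified in this abstract, so the scores here are placeholder inputs. The function names (`scale_macroblocks`, `conv_params`, `reduction_percent`), the `min_scale` floor, and the plain 3x3 convolution chain used to count parameters are all hypothetical choices made for illustration.

```python
def scale_macroblocks(widths, redundancy, min_scale=0.1):
    """Shrink each macroblock's channel width by its redundancy score.

    widths:     output channel count of each macroblock.
    redundancy: hypothetical per-macroblock score in [0, 1]; a stand-in
                for whatever measure the real method uses.
    min_scale:  assumed floor so that no block is removed entirely.
    """
    scaled = []
    for width, score in zip(widths, redundancy):
        scale = max(min_scale, 1.0 - score)
        scaled.append(max(1, round(width * scale)))
    return scaled


def conv_params(widths, in_channels=3, kernel=3):
    """Count weights of a simple chain of kernel x kernel convolutions (no biases)."""
    total = 0
    prev = in_channels
    for width in widths:
        total += prev * width * kernel * kernel
        prev = width
    return total


def reduction_percent(original, reduced):
    """Model-size reduction expressed as a percentage of the original size."""
    return 100.0 * (1.0 - reduced / original)


if __name__ == "__main__":
    widths = [16, 32, 64]            # toy network, not a model from the paper
    redundancy = [0.0, 0.5, 0.75]    # made-up scores for illustration only
    new_widths = scale_macroblocks(widths, redundancy)
    before = conv_params(widths)
    after = conv_params(new_widths)
    print(new_widths, before, after, round(reduction_percent(before, after), 2))
```

Because a block's parameter count depends on both its input and output widths, shrinking consecutive blocks compounds: in this toy chain, scaling the last two blocks to half and a quarter of their width removes well over half of the weights. The percentages reported in the abstract come from the authors' experiments and are not reproduced by this sketch.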