CV, LG · Jun 6, 2024

ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs

Fang Chen, Gourav Datta, Mujahid Al Rafi, Hyeran Jeon, Meng Tang

arXiv:2406.03744v3 · 2.0 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses memory constraints for deploying computer vision models on edge devices, presenting an incremental improvement over existing distillation methods.

The paper tackles the high peak memory consumption of CNNs on resource-constrained edge devices by proposing ReDistill, a residual encoded distillation method. It reduces theoretical peak memory by 4x-5x for image classification and by 4x for diffusion-based image generation, with minimal performance degradation.

The expansion of neural network sizes and the enhanced resolution of modern image sensors result in heightened memory and power demands to process modern computer vision models. To deploy these models on extremely resource-constrained edge devices, it is crucial to reduce their peak memory, which is the maximum memory consumed during the execution of a model. A naive approach to reducing peak memory is aggressive down-sampling of feature maps via pooling with a large stride, which often results in unacceptable degradation in network performance. To mitigate this problem, we propose residual encoded distillation (ReDistill) for peak memory reduction in a teacher-student framework, in which a student network with less memory is derived from the teacher network using aggressive pooling. We apply our distillation method to multiple problems in computer vision, including image classification and diffusion-based image generation. For image classification, our method yields 4x-5x theoretical peak memory reduction with less degradation in accuracy for most CNN-based architectures. For diffusion-based image generation, our proposed distillation method yields a denoising network with 4x lower theoretical peak memory while maintaining decent diversity and fidelity for image generation. Experiments demonstrate our method's superior performance compared to other feature-based and response-based distillation methods when applied to the same student network. The code is available at https://github.com/mengtang-lab/ReDistill.
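To make the notion of peak memory concrete, the following is a minimal sketch of how a larger pooling stride lowers the largest activation footprint in a toy network. It is not the paper's measurement procedure: the memory model (input plus output feature map of each stage held at once), the stage lists, and the names `num_elements` and `peak_activation_elements` are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def num_elements(shape: Shape) -> int:
    """Number of scalar values in a feature map of the given shape."""
    c, h, w = shape
    return c * h * w


def peak_activation_elements(input_shape: Shape,
                             stages: List[Tuple[int, int]]) -> int:
    """Toy estimate of peak activation memory, in elements.

    Assumption: each stage is a convolution followed by pooling, described
    by (out_channels, stride), and only the stage's input and output
    feature maps are alive at the same time.
    """
    peak = 0
    current = input_shape
    for out_channels, stride in stages:
        _, h, w = current
        output = (out_channels, h // stride, w // stride)
        peak = max(peak, num_elements(current) + num_elements(output))
        current = output
    return peak


# Hypothetical stage configurations; the student pools more aggressively
# (stride 4 instead of 2) in its first stage.
teacher_stages = [(64, 2), (64, 2), (128, 2)]
student_stages = [(64, 4), (64, 2), (128, 2)]

teacher_peak = peak_activation_elements((3, 224, 224), teacher_stages)
student_peak = peak_activation_elements((3, 224, 224), student_stages)

print(f"teacher peak: {teacher_peak} elements")
print(f"student peak: {student_peak} elements")
print(f"reduction:    {teacher_peak / student_peak:.2f}x")
```

In this toy model the early, high-resolution stages dominate the peak, which is why pooling aggressively near the input shrinks it. The exact reduction factor depends entirely on the assumed stage list and should not be read as a reproduction of the 4x-5x figures reported in the abstract.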
