Deep Compressed Pneumonia Detection for Low-Power Embedded Devices
This work addresses the high memory and computation requirements of medical AI applications on portable devices, and represents an incremental improvement in model compression techniques.
The paper tackles the challenge of deploying deep neural networks for pneumonia detection on low-power embedded devices by developing a systematic structured weight pruning method, achieving up to 36x compression with no accuracy loss.
Deep neural networks (DNNs) have expanded into medical fields and transformed some medical applications by extracting complex features and achieving high accuracy. However, large-scale networks place high demands on both memory storage and computation resources, which is a particular problem for portable medical devices and other embedded systems. In this work, we first train a DNN for pneumonia detection using the dataset provided by the RSNA Pneumonia Detection Challenge. To overcome the hardware limitations on implementing large-scale networks, we develop a systematic structured weight pruning method with filter sparsity, column sparsity, and combined sparsity. Experiments show that we can achieve up to a 36x compression ratio compared to the original 106-layer model with no accuracy degradation. We evaluate the proposed methods on a low-power embedded device, the Jetson TX2, and achieve low power usage and high energy efficiency.
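To make the three sparsity patterns concrete, the following is a minimal sketch, not the authors' implementation. It assumes a convolutional layer's weights have been flattened into a matrix with one row per filter and one column per (channel, kernel position) entry, so filter sparsity zeroes whole rows, column sparsity zeroes whole columns, and combined sparsity applies both. The abstract does not say how the pruned structures are selected; ranking by L2 norm is an assumption used here purely for illustration, and the names `prune_filters`, `prune_columns`, and `compression_ratio` are hypothetical.

```python
import math


def l2_norm(values):
    """Euclidean norm of a sequence of weights."""
    return math.sqrt(sum(v * v for v in values))


def prune_filters(matrix, keep):
    """Filter sparsity: zero every row except the `keep` rows with the largest L2 norm."""
    norms = [l2_norm(row) for row in matrix]
    ranked = sorted(range(len(matrix)), key=lambda i: norms[i], reverse=True)
    kept = set(ranked[:keep])
    return [list(row) if i in kept else [0.0] * len(row)
            for i, row in enumerate(matrix)]


def prune_columns(matrix, keep):
    """Column sparsity: zero every column except the `keep` columns with the largest L2 norm."""
    n_cols = len(matrix[0])
    norms = [l2_norm([row[j] for row in matrix]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: norms[j], reverse=True)
    kept = set(ranked[:keep])
    return [[v if j in kept else 0.0 for j, v in enumerate(row)]
            for row in matrix]


def compression_ratio(n_rows, n_cols, kept_rows, kept_cols):
    """Original weight count divided by the size of the dense block that survives pruning."""
    return (n_rows * n_cols) / (kept_rows * kept_cols)


# Toy layer (made-up values): 4 filters, 6 weights per filter.
W = [
    [0.90, -0.10, 0.80, 0.05, -0.70, 0.02],
    [0.01, 0.02, -0.03, 0.01, 0.02, 0.01],
    [-0.60, 0.10, 0.70, -0.02, 0.90, 0.03],
    [0.02, -0.01, 0.04, 0.03, -0.02, 0.01],
]

# Combined sparsity: prune filters first, then columns of the result.
combined = prune_columns(prune_filters(W, keep=2), keep=3)
ratio = compression_ratio(4, 6, kept_rows=2, kept_cols=3)  # 24 weights -> 6 retained
```

Because whole rows and columns are removed, the surviving weights form a smaller dense matrix. This regularity is what makes structured pruning attractive on embedded hardware, since no sparse index storage is needed. In practice the pruned network would also be retrained to recover accuracy, a step this sketch omits.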