3DQ: Compact Quantized Neural Networks for Volumetric Whole Brain Segmentation
This work addresses storage constraints in space-critical medical imaging applications, such as whole brain segmentation, by enabling compact models.
The authors tackled the problem of large model sizes in 3D fully convolutional neural networks for whole brain segmentation by proposing 3DQ, a ternary quantization method that achieves 16x model compression while maintaining performance comparable to full precision models.
Model architectures have been dramatically increasing in size, improving performance at the cost of higher resource requirements. In this paper we propose 3DQ, a ternary quantization method, applied for the first time to 3D Fully Convolutional Neural Networks (F-CNNs), enabling 16x model compression while maintaining performance on par with full precision models. We extensively evaluate 3DQ on two datasets for the challenging task of whole brain segmentation. Additionally, we showcase our method's ability to generalize to two common 3D architectures, namely 3D U-Net and V-Net. Outperforming a variety of baselines, the proposed method is capable of compressing large 3D models to a few megabytes, alleviating the storage needs in space-critical applications.
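The 16x figure is consistent with storing each weight in 2 bits instead of a 32-bit float. The sketch below illustrates that arithmetic on a single 3D convolution kernel. It uses a generic threshold-based ternarization, which is an assumption made for illustration and not necessarily the scheme used by 3DQ; the names `ternarize`, `pack_ternary`, and `threshold_ratio`, as well as the kernel shape, are hypothetical.

```python
import numpy as np

def ternarize(weights, threshold_ratio=0.7):
    """Map weights to codes in {-1, 0, +1} plus one scale factor.

    Generic threshold rule (assumption, not the 3DQ method): weights whose
    magnitude is below threshold_ratio * mean(|w|) become 0.
    """
    delta = threshold_ratio * np.mean(np.abs(weights))
    codes = np.zeros(weights.shape, dtype=np.int8)
    codes[weights > delta] = 1
    codes[weights < -delta] = -1
    mask = codes != 0
    # Scale is the mean magnitude of the weights that survived the threshold.
    scale = float(np.abs(weights[mask]).mean()) if mask.any() else 0.0
    return codes, scale

def pack_ternary(codes):
    """Pack ternary codes at 2 bits each, four codes per byte."""
    flat = (codes.ravel() + 1).astype(np.uint8)  # {-1, 0, 1} -> {0, 1, 2}
    pad = (-flat.size) % 4                       # pad to a multiple of four
    flat = np.concatenate([flat, np.zeros(pad, dtype=np.uint8)])
    quads = flat.reshape(-1, 4)
    packed = (quads[:, 0]
              | (quads[:, 1] << 2)
              | (quads[:, 2] << 4)
              | (quads[:, 3] << 6))
    return packed.astype(np.uint8)

# Hypothetical 3D conv kernel: (out_channels, in_channels, depth, height, width).
rng = np.random.default_rng(0)
kernel = rng.standard_normal((16, 8, 3, 3, 3)).astype(np.float32)

codes, scale = ternarize(kernel)
packed = pack_ternary(codes)

# 32 bits per float weight vs. 2 bits per packed ternary code.
compression_ratio = kernel.nbytes / packed.nbytes
print(f"float32: {kernel.nbytes} bytes, packed: {packed.nbytes} bytes, "
      f"ratio: {compression_ratio:.1f}x")
```

The ratio here ignores the per-layer scale factor and any layers kept in full precision, so the compression of a complete model can differ slightly from this per-kernel estimate.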