Implicit Feature Decoupling with Depthwise Quantization
This work addresses efficient model compression and representation in deep learning, particularly for image features. It appears incremental, since it builds on existing encoder-decoder frameworks without architectural changes.
The paper aims to increase the representation capacity of deep neural networks while keeping memory and parameter usage efficient. It proposes Depthwise Quantization (DQ), which applies quantization to sub-tensors decomposed along the feature axis. The resulting model uses 69% fewer parameters, converges faster, and outperforms the previous state of the art on likelihood estimation on CIFAR-10, ImageNet-32, and ImageNet-64.
Quantization has been applied to multiple domains in Deep Neural Networks (DNNs). We propose Depthwise Quantization (DQ), where $\textit{quantization}$ is applied to sub-tensors decomposed along a $\textit{feature axis}$ of weak statistical dependence. The feature decomposition leads to an exponential increase in $\textit{representation capacity}$ with only a linear increase in memory and parameter cost. In addition, DQ can be applied directly to existing encoder-decoder frameworks without modifying the DNN architecture. We use DQ in the context of a Hierarchical Auto-Encoder and train end-to-end on an image feature representation. We provide an analysis of the cross-correlation between spatial and channel features, and we propose a decomposition of the image feature representation along the channel axis. The improved performance of the depthwise operator is due to the increased representation capacity from implicit feature decoupling. We evaluate DQ on the likelihood estimation task, where it outperforms the previous state of the art on CIFAR-10, ImageNet-32 and ImageNet-64. We progressively train a single hierarchical model with increasing image size; it uses 69% fewer parameters and converges faster than previous work.
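
The capacity claim can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes plain nearest-neighbour vector quantization, one independent codebook per channel group, and a fixed code dimension `d` per group. The names `depthwise_quantize`, `num_code_combinations`, and `codebook_parameters` are hypothetical and chosen only for illustration. Under these assumptions, `G` codebooks of `K` codes each can express `K**G` distinct quantized vectors while storing only `G * K * d` values.

```python
import numpy as np


def depthwise_quantize(z, codebooks):
    """Quantize each channel group of z with its own codebook (illustrative sketch).

    z:         array of shape (N, C), N feature vectors with C channels.
    codebooks: array of shape (G, K, d), G codebooks of K codes, with C == G * d.
    Returns (indices of shape (N, G), quantized features of shape (N, C)).
    """
    G, K, d = codebooks.shape
    N, C = z.shape
    assert C == G * d, "channels must split evenly into G groups"

    # Decompose along the channel (feature) axis into G sub-vectors of size d.
    groups = z.reshape(N, G, d)

    # Squared distance from every sub-vector to every code of its own codebook.
    diff = groups[:, :, None, :] - codebooks[None, :, :, :]  # (N, G, K, d)
    dist = (diff ** 2).sum(axis=-1)                          # (N, G, K)

    # Nearest code per group, chosen independently for each group.
    idx = dist.argmin(axis=-1)                               # (N, G)
    quantized = codebooks[np.arange(G)[None, :], idx]        # (N, G, d)
    return idx, quantized.reshape(N, C)


def num_code_combinations(G, K):
    """Distinct quantized vectors expressible with G codebooks of K codes."""
    return K ** G


def codebook_parameters(G, K, d):
    """Stored codebook values; linear in G when K and d are held fixed."""
    return G * K * d
```

With `K = 8` and `d = 3`, for example, going from one group to four raises the number of expressible combinations from 8 to 4096, while the stored codebook values grow only from 24 to 96.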