A Principled Hierarchical Deep Learning Approach to Joint Image Compression and Classification
This work addresses efficient image compression and classification for edge computing applications, offering an incremental improvement in optimizing distributed models for bandwidth-constrained environments.
The paper tackles the challenge of training distributed deep learning models for remote image classification under limited channel bandwidth. It proposes a joint learning strategy that optimizes encoder latents for compactness and discriminative power, achieving accuracy improvements of up to 1.5% on CIFAR-10 and 3% on CIFAR-100 over conventional end-to-end cross-entropy training.
Among applications of deep learning (DL) involving low-cost sensors, remote image classification involves a physical channel that separates edge sensors from cloud classifiers. Traditional DL models must be divided between an encoder at the sensor and a decoder + classifier at the edge server. An important challenge is to train such distributed models effectively when the connecting channels have limited rate/capacity. Our goal is to optimize DL models such that the encoder latent requires low channel bandwidth while still delivering the feature information needed for high classification accuracy. This work proposes a three-step joint learning strategy to guide encoders to extract features that are compact, discriminative, and amenable to common augmentations/transformations. We optimize the latent dimension through an initial screening phase before end-to-end (E2E) training. To obtain an adjustable bit rate via a single pre-deployed encoder, we apply entropy-based quantization and/or manual truncation to the latent representations. Tests show that our proposed method achieves accuracy improvements of up to 1.5% on CIFAR-10 and 3% on CIFAR-100 over conventional E2E cross-entropy training.
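To make the adjustable-bit-rate idea concrete, the following is a minimal sketch of how truncation and quantization of a latent vector could trade bandwidth for fidelity. It is not the method of this work: uniform scalar quantization with an empirical entropy estimate stands in for the entropy-based quantization, whose details are not given in this abstract. All function names, the step size, the latent dimension of 64, and the Gaussian stand-in latents are illustrative assumptions.

```python
import numpy as np


def truncate_latent(z, keep):
    """Manual truncation: keep only the first `keep` latent dimensions."""
    return z[..., :keep]


def uniform_quantize(z, step):
    """Map each latent value to the integer index of its nearest multiple of `step`."""
    return np.round(z / step).astype(np.int64)


def empirical_entropy_bits(symbols):
    """Empirical Shannon entropy, in bits per symbol, of an integer array."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def estimated_bits_per_image(z, keep, step):
    """Rough rate estimate: kept dimensions times per-symbol entropy.

    Assumes an ideal entropy coder and one shared symbol distribution
    across all latent dimensions (a simplification).
    """
    q = uniform_quantize(truncate_latent(z, keep), step)
    return keep * empirical_entropy_bits(q)


# Hypothetical stand-in for encoder outputs: 1000 images, 64-dim latents.
rng = np.random.default_rng(0)
latents = rng.normal(size=(1000, 64))

# A single set of latents yields different rates as `keep` and `step` change.
for keep, step in [(64, 0.5), (64, 2.0), (16, 0.5)]:
    bits = estimated_bits_per_image(latents, keep, step)
    print(f"keep={keep:2d} step={step}: about {bits:.1f} bits per image")
```

In this sketch, a coarser step or a smaller `keep` lowers the estimated rate without retraining the encoder, which is the property a single pre-deployed encoder needs. The effect on classification accuracy would have to be measured with the actual decoder and classifier.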