CV, GR, IV | Mar 19, 2022

Towards Device Efficient Conditional Image Generation

arXiv:2203.10363v2 | 3 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the challenge of deploying photo-realistic image generation models efficiently on resource-constrained devices, though it is incremental as it builds on existing autoencoder and pruning techniques.

The paper tackles the problem of reducing tensor compute for conditional image generation autoencoders without sacrificing quality, achieving real-time performance on CPU-only devices while maintaining image quality across tasks like segmentation-to-face and face cartoonization.

We present a novel algorithm to reduce the tensor compute required by a conditional image generation autoencoder without sacrificing the quality of photo-realistic image generation. Our method is device agnostic, and can optimize an autoencoder for a given CPU-only or GPU compute device(s) in about the time it normally takes to train an autoencoder on a generic workstation. We achieve this via a novel two-stage strategy: first, we condense the channel weights so that as few channels as possible are used. Then, we prune the nearly zeroed-out weight activations and fine-tune the autoencoder. To maintain image quality, fine-tuning is done via student-teacher training, where we reuse the condensed autoencoder as the teacher. We show performance gains for various conditional image generation tasks: segmentation mask to face images, face images to cartoonization, and finally a CycleGAN-based model, over multiple compute devices. We perform various ablation studies to justify the claims and design choices, and achieve real-time versions of various autoencoders on CPU-only devices while maintaining image quality, thus enabling at-scale deployment of such autoencoders.
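
The prune-then-distill idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the condensation criterion, so the per-channel L1 norm, the `threshold` value, the mean-absolute-error distillation loss, and all function names below are assumptions chosen for illustration. Each output channel is represented as a flat list of filter weights.

```python
from typing import List, Sequence, Tuple


def channel_norms(weights: Sequence[Sequence[float]]) -> List[float]:
    """L1 norm of each output channel's flattened filter weights."""
    return [sum(abs(w) for w in channel) for channel in weights]


def prune_channels(
    weights: Sequence[Sequence[float]], threshold: float = 1e-3
) -> Tuple[List[List[float]], List[int]]:
    """Drop channels whose L1 norm is at or below `threshold`.

    Assumes a prior condensation stage has already pushed unneeded
    channels towards zero. Returns the surviving weights and their
    original channel indices.
    """
    norms = channel_norms(weights)
    kept = [i for i, n in enumerate(norms) if n > threshold]
    return [list(weights[i]) for i in kept], kept


def distillation_loss(
    student_out: Sequence[float], teacher_out: Sequence[float]
) -> float:
    """Mean absolute difference between student and teacher outputs."""
    if len(student_out) != len(teacher_out):
        raise ValueError("student and teacher outputs must have equal length")
    return sum(abs(s - t) for s, t in zip(student_out, teacher_out)) / len(student_out)


# Toy layer with four output channels; two are (nearly) zeroed out.
weights = [
    [0.8, -0.5, 0.3],
    [1e-5, -2e-5, 0.0],
    [0.0, 0.0, 0.0],
    [-0.4, 0.9, 0.1],
]
pruned, kept = prune_channels(weights)
print(kept)  # indices of the channels that survive pruning
```

In this sketch the condensed (unpruned) model would play the teacher and the pruned model the student, with `distillation_loss` standing in for whatever fine-tuning objective the paper actually uses.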
