CV, AI · Jul 1, 2023

WavePaint: Resource-efficient Token-mixer for Self-supervised Inpainting

arXiv:2307.00407v1 · 8 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the computational cost of image inpainting, which is used for image restoration and as a self-supervision task, by offering a more efficient alternative to transformer- or CNN-based models trained in adversarial or diffusion settings.

The paper tackles the problem of computationally heavy image inpainting models by proposing WavePaint, a resource-efficient, fully convolutional architecture that uses a 2D discrete wavelet transform (DWT) for token-mixing. It outperforms state-of-the-art models on reconstruction quality with fewer than half the parameters and shorter training times, and it surpasses GAN-based architectures on the CelebA-HQ dataset without adversarial training.

Image inpainting, which refers to the synthesis of missing regions in an image, can help restore occluded or degraded areas and also serve as a precursor task for self-supervision. The current state-of-the-art models for image inpainting are computationally heavy because they are based on transformer or CNN backbones that are trained in adversarial or diffusion settings. This paper diverges from vision transformers by using a computationally efficient, WaveMix-based, fully convolutional architecture called WavePaint. It uses a 2D discrete wavelet transform (DWT) for spatial and multi-resolution token-mixing along with convolutional layers. The proposed model outperforms the current state-of-the-art models for image inpainting on reconstruction quality while using less than half the parameter count and considerably lower training and evaluation times. Our model even outperforms current GAN-based architectures on the CelebA-HQ dataset without using an adversarially trainable discriminator. Our work suggests that neural architectures that are modeled after natural image priors require fewer parameters and computations to achieve generalization comparable to transformers.
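
To make the token-mixing idea concrete, here is a minimal sketch of a one-level 2D DWT on a single-channel image. It is not the authors' implementation. It assumes the Haar wavelet and plain Python lists, and the function name `haar_dwt2` and the sub-band names are hypothetical. Each output coefficient combines a 2x2 neighbourhood of input pixels, so the transform mixes spatial information and halves the resolution with no learned parameters. In a WaveMix-style block the sub-bands would then be passed to convolutional layers, which this sketch omits.

```python
def haar_dwt2(image):
    """One-level 2D Haar DWT of an H x W grid (H and W must be even).

    Returns four (H/2) x (W/2) sub-bands: a low-pass approximation and
    three detail bands (differences across columns, rows, and diagonals).
    The 1/2 scaling makes the transform orthonormal (energy-preserving).
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")

    approx, diff_cols, diff_rows, diff_diag = [], [], [], []
    for i in range(0, h, 2):
        r_a, r_c, r_r, r_d = [], [], [], []
        for j in range(0, w, 2):
            # The four pixels of one non-overlapping 2x2 block.
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            r_a.append((a + b + c + d) / 2)  # local average (low-pass)
            r_c.append((a - b + c - d) / 2)  # left column minus right column
            r_r.append((a + b - c - d) / 2)  # top row minus bottom row
            r_d.append((a - b - c + d) / 2)  # diagonal difference
        approx.append(r_a)
        diff_cols.append(r_c)
        diff_rows.append(r_r)
        diff_diag.append(r_d)
    return approx, diff_cols, diff_rows, diff_diag


# Usage: a 4x4 ramp image becomes four 2x2 sub-bands.
img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]
approx, diff_cols, diff_rows, diff_diag = haar_dwt2(img)
print(approx)  # [[7.0, 11.0], [23.0, 27.0]]
```

Because the transform is orthonormal, the sum of squared values is the same before and after it, so the halved resolution reorganises the image content across four sub-bands and discards nothing.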

Code implementations: 1 repo