CV, IV · Aug 13, 2020

Powers of layers for image-to-image translation

Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

arXiv:2008.05763v13.32 citations

Originality: Incremental advance

AI Analysis

This paper addresses image-to-image translation for computer vision applications. It offers a more parameter-efficient approach whose transformation strength can be adjusted at test time, though it appears incremental because it builds on autoencoder and residual-block concepts.

The paper tackles unpaired image-to-image translation tasks such as style transfer and denoising. It proposes a simple architecture that learns a residual block in the latent space of a fixed autoencoder and applies it iteratively until the target domain is reached, achieving performance comparable to or better than CycleGAN with significantly fewer parameters.

We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc. We start from an image autoencoder architecture with fixed weights. For each task we learn a residual block operating in the latent space, which is iteratively called until the target domain is reached. A specific training schedule is required to alleviate the exponentiation effect of the iterations. At test time, it offers several advantages: the number of weight parameters is limited and the compositional design allows one to modulate the strength of the transformation with the number of iterations. This is useful, for instance, when the type or amount of noise to suppress is not known in advance. Experimentally, we provide proofs of concept showing the interest of our method for many transformations. The performance of our model is comparable to or better than that of CycleGAN, with significantly fewer parameters.
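The iterated-block idea in the abstract can be illustrated with a minimal sketch. Everything here is a toy stand-in and an assumption, not the authors' model: `encode` and `decode` are hypothetical fixed functions playing the role of the frozen autoencoder, and `make_residual_block` builds a hand-written block that pulls the latent toward a fixed target, whereas in the paper the block is a learned network. The sketch only shows how one block, applied a variable number of times in latent space, lets the number of iterations control the strength of the transformation.

```python
from typing import Callable, List

Vector = List[float]


def encode(image: Vector) -> Vector:
    """Hypothetical frozen encoder: a fixed scaling, never trained per task."""
    return [2.0 * x for x in image]


def decode(latent: Vector) -> Vector:
    """Hypothetical frozen decoder: the exact inverse of `encode`."""
    return [z / 2.0 for z in latent]


def make_residual_block(target: Vector, step: float) -> Callable[[Vector], Vector]:
    """Toy residual block z -> z + g(z), with g(z) = step * (target - z).

    Assumption for illustration only: a learned block would replace g.
    """
    def block(z: Vector) -> Vector:
        return [zi + step * (ti - zi) for zi, ti in zip(z, target)]
    return block


def translate(image: Vector, block: Callable[[Vector], Vector], n_iterations: int) -> Vector:
    """Encode once, apply the same block n_iterations times, decode once."""
    z = encode(image)
    for _ in range(n_iterations):
        z = block(z)
    return decode(z)


if __name__ == "__main__":
    block = make_residual_block(target=[2.0, 2.0], step=0.5)
    for n in (0, 1, 2, 4):
        # More iterations move the output further toward the target domain.
        print(n, translate([0.0, 1.0], block, n))
```

With zero iterations the input is simply reconstructed by the autoencoder; each additional call of the block moves the output further toward the target, which is the sense in which the iteration count modulates the transformation.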
