Spectral Normalization and Dual Contrastive Regularization for Image-to-Image Translation
This work improves image generation quality for computer vision applications, but it is incremental as it builds on existing contrastive learning and GAN approaches.
The paper tackles unpaired image-to-image translation, addressing the neglect of global structure constraints in existing methods, and proposes a framework that it reports as achieving state-of-the-art performance on multiple tasks.
Existing image-to-image (I2I) translation methods achieve state-of-the-art performance by incorporating patch-wise contrastive learning into Generative Adversarial Networks (GANs). However, patch-wise contrastive learning focuses only on local content similarity and neglects global structure constraints, which degrades the quality of the generated images. In this paper, we propose a new unpaired I2I translation framework based on dual contrastive regularization and spectral normalization, named SN-DCR. To maintain consistency of global structure and texture, we design a dual contrastive regularization that operates in two different deep feature spaces. To improve the global structure of the generated images, we formulate a semantic contrastive loss that makes their global semantic structure similar to that of real images from the target domain in the semantic feature space. To improve the global texture of the generated images, we use Gram matrices to extract texture style from images and design an analogous style contrastive loss. Moreover, to enhance the stability of the model, we employ spectrally normalized convolutional layers in the design of our generator. We conduct comprehensive experiments to evaluate the effectiveness of SN-DCR, and the results show that our method achieves state-of-the-art performance on multiple tasks.
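The three ingredients named in the abstract (Gram-matrix style features, a contrastive loss, and spectral normalization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give the exact form of the contrastive losses, so an InfoNCE-style loss is swapped in here as a common stand-in. The choice of negatives, the temperature value, the Gram normalization constant, and all function names (`gram_matrix`, `contrastive_loss`, `spectral_normalize`) are assumptions made for illustration.

```python
import numpy as np


def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map, used as a texture/style descriptor.

    The 1 / (C * H * W) normalization is an assumed convention, not from the paper.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # one row per channel
    return flat @ flat.T / (c * h * w)     # (C, C) channel correlations


def contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE-style loss (assumed stand-in for the paper's contrastive losses).

    Pulls the anchor toward the positive and away from each negative,
    using cosine similarity between flattened feature vectors.
    """
    def unit(v):
        v = np.ravel(v)
        return v / (np.linalg.norm(v) + 1e-8)

    a = unit(anchor)
    sims = np.array([a @ unit(positive)] + [a @ unit(n) for n in negatives])
    logits = sims / temperature
    logits = logits - logits.max()         # subtract max for numerical stability
    # Cross-entropy with the positive pair as the correct class (index 0).
    return float(np.log(np.exp(logits).sum()) - logits[0])


def spectral_normalize(weight, n_iters=50, seed=0):
    """Divide a conv weight (out, in, kh, kw) by its largest singular value.

    The singular value is estimated by power iteration on the weight
    reshaped to a 2-D matrix of shape (out, in * kh * kw).
    """
    w = weight.reshape(weight.shape[0], -1)
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape[0])
    for _ in range(n_iters):
        v = w.T @ u
        v /= np.linalg.norm(v) + 1e-12
        u = w @ v
        u /= np.linalg.norm(u) + 1e-12
    sigma = u @ w @ v                      # estimate of the spectral norm
    return weight / sigma
```

Under these assumptions, a style contrastive term could be computed as `contrastive_loss(gram_matrix(f_generated), gram_matrix(f_target), [gram_matrix(f_source)])`, where `f_generated`, `f_target`, and `f_source` are hypothetical feature maps taken from a fixed feature extractor. A semantic term would apply the same loss directly to deeper feature maps instead of their Gram matrices.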