Semantic Map Injected GAN Training for Image-to-Image Translation
This work is an incremental improvement to image-to-image translation for computer vision applications.
The paper tackles image-to-image translation by injecting semantic map training into GANs, resulting in improved generalization and better preservation of categorical information, with gains in SSIM, FID, and KID scores on CityScapes and RGB-NIR stereo datasets.
Image-to-image translation, a recent trend in computer vision, transforms images from one domain to another using generative adversarial networks (GANs). Existing GAN models are trained using only the input and output modalities of the transformation. In this paper, we perform semantic-injected training of GAN models. Specifically, we train with the original input and output modalities and inject a few epochs of training for translation from the input to the semantic map. We refer to training for the translation of the input image into the target domain as the original training. Injecting semantic training into the original training improves the generalization capability of the trained GAN model. Moreover, it better preserves categorical information in the generated image. The semantic map is used only at training time and is not required at test time. The experiments are performed using state-of-the-art GAN models on the CityScapes and RGB-NIR stereo datasets. We observe improved SSIM, FID, and KID scores after injecting semantic training, compared with the original training.
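The injection idea described in the abstract can be illustrated with a minimal scheduling sketch in Python. The abstract does not state how many semantic epochs are injected or where they are placed, so the function names `build_schedule` and `select_pair`, the parameters `inject_every` and `inject_len`, and the choice to end on the original task are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Any, Dict, List, Tuple


def build_schedule(original_epochs: int, inject_every: int, inject_len: int) -> List[str]:
    """Return one phase label per training epoch.

    "target"   = original training (input image -> target domain)
    "semantic" = injected training (input image -> semantic map)

    Assumption: after every `inject_every` original epochs, `inject_len`
    semantic epochs are inserted, except after the final original epoch,
    so that training always ends on the original task.
    """
    if original_epochs < 0 or inject_every <= 0 or inject_len < 0:
        raise ValueError("epoch counts must be non-negative and inject_every positive")
    schedule: List[str] = []
    for epoch in range(1, original_epochs + 1):
        schedule.append("target")
        if epoch % inject_every == 0 and epoch < original_epochs:
            schedule.extend(["semantic"] * inject_len)
    return schedule


def select_pair(sample: Dict[str, Any], phase: str) -> Tuple[Any, Any]:
    """Pick the (source, destination) pair the GAN is trained on in this phase.

    `sample` is a hypothetical record with keys "input", "target", "semantic".
    The semantic map is read only here, during training; inference needs
    only sample["input"].
    """
    if phase == "target":
        return sample["input"], sample["target"]
    if phase == "semantic":
        return sample["input"], sample["semantic"]
    raise ValueError(f"unknown phase: {phase!r}")


if __name__ == "__main__":
    # Six original epochs with one semantic epoch injected after every second one.
    for i, phase in enumerate(build_schedule(6, 2, 1), start=1):
        print(i, phase)
```

In an actual training loop, each epoch would call `select_pair` on every sample and pass the resulting pair to the same generator and discriminator update used for the original training, so that only the destination modality changes between phases.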