Multi-domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution
This work addresses the problem of generating diverse image outputs across multiple visual domains, though the contribution appears incremental because it builds on existing image-to-image (I2I) translation methods.
The paper tackles multi-domain unsupervised image-to-image translation. It proposes a framework that combines appearance adaptive convolution with a contrast learning objective to translate an image into multiple target appearances while preserving its geometric content. The authors report visually diverse and plausible results compared with state-of-the-art methods.
Over the past few years, image-to-image (I2I) translation methods have been proposed to translate a given image into diverse outputs. Despite their impressive results, these methods mainly focus on translation between two domains, so multi-domain I2I translation remains a challenge. To address this problem, we propose a novel multi-domain unsupervised image-to-image translation (MDUIT) framework that leverages a decomposed content feature and appearance adaptive convolution to translate an image into a target appearance while preserving the given geometric content. We also exploit a contrast learning objective, which improves the disentanglement ability and effectively utilizes multi-domain image data during training by pairing semantically similar images. This allows our method to learn diverse mappings between multiple visual domains with a single framework. We show that the proposed method produces visually diverse and plausible results in multiple domains compared to state-of-the-art methods.
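For illustration only, the two ingredients named in the abstract can be sketched in plain Python. This is a minimal sketch under assumptions the abstract does not state: the convolution kernel is assumed to be a linear function of an appearance code and is restricted to a 1x1 kernel, and the contrast learning objective is assumed to take an InfoNCE-like form with cosine similarity. The names `adaptive_conv1x1`, `predictor`, `info_nce`, and `temperature` are hypothetical and are not taken from the paper.

```python
import math


def adaptive_conv1x1(content, appearance, predictor):
    """Apply a 1x1 convolution whose kernel is predicted from an appearance code.

    content:    C_in x H x W nested lists (content feature map)
    appearance: list of length D (appearance code)
    predictor:  C_out x C_in x D nested lists (hypothetical linear kernel predictor)
    """
    # Assumption: kernel entry (o, i) is a linear function of the appearance code.
    kernel = [[sum(w * a for w, a in zip(weights, appearance)) for weights in row]
              for row in predictor]
    c_in = len(content)
    height, width = len(content[0]), len(content[0][0])
    # The same predicted kernel mixes channels at every spatial position,
    # so the spatial layout (geometry) of the content feature is unchanged.
    return [[[sum(kernel[o][i] * content[i][y][x] for i in range(c_in))
              for x in range(width)]
             for y in range(height)]
            for o in range(len(kernel))]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor: low when the positive is the closest sample."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy usage: one content feature rendered with two different appearance codes.
content = [[[1.0, 2.0]], [[3.0, 4.0]]]      # C_in=2, H=1, W=2
predictor = [[[1.0, 5.0], [2.0, 7.0]]]      # C_out=1, C_in=2, D=2
out_a = adaptive_conv1x1(content, [1.0, 0.0], predictor)
out_b = adaptive_conv1x1(content, [0.0, 1.0], predictor)

# Toy usage: the loss is smaller when the positive sample matches the anchor.
loss_matched = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_mismatched = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In this sketch the content feature fixes where things are, while the appearance code only changes how channels are mixed, which is one plausible reading of "translate an image into a target appearance while preserving the given geometric content". The actual kernel generation and pairing strategy of the proposed method may differ.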