Vector Quantized Image-to-Image Translation
This work addresses the limited flexibility of image manipulation methods available to computer vision researchers and practitioners, though the contribution appears incremental because it builds on existing techniques.
The paper tackles a limitation of current image-to-image translation methods, which often learn only recolorization or regional changes because of the structural constraints imposed by the conditional input. It proposes a unified framework based on vector quantization that supports image-to-image translation, unconditional generation, and image extension, and reports performance comparable to state-of-the-art methods.
Current image-to-image translation methods formulate the task with conditional generation models. Constrained by the rich structural information provided by the conditional contexts, they learn only recolorization or regional changes. In this work, we propose introducing the vector quantization technique into the image-to-image translation framework. The vector quantized content representation facilitates not only translation, but also learning the unconditional distribution shared among different domains. Together with the disentangled style representation, the proposed method further enables image extension with flexibility in both intra- and inter-domain settings. Qualitative and quantitative experiments demonstrate that our framework achieves performance comparable to state-of-the-art image-to-image translation and image extension methods. Compared to methods for individual tasks, the proposed method, as a unified framework, enables applications that combine image-to-image translation, unconditional generation, and image extension. For example, it provides style variability for image generation and extension, and equips image-to-image translation with further extension capabilities.
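
As a minimal illustration of the vector quantization operation referred to above, the following sketch maps each continuous feature vector to its nearest entry in a discrete codebook. It is not the framework's implementation: the function names, the toy codebook, and the feature values are all hypothetical, and a practical model would learn the codebook jointly with an encoder and decoder.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[Vector]]:
    """Replace each feature vector with its nearest codebook entry.

    Returns the chosen codebook indices (the discrete representation)
    and the corresponding codebook vectors (the quantized features).
    """
    indices: List[int] = []
    quantized: List[Vector] = []
    for f in features:
        # Nearest-neighbor lookup over all codebook entries.
        idx = min(
            range(len(codebook)),
            key=lambda k: squared_distance(f, codebook[k]),
        )
        indices.append(idx)
        quantized.append(codebook[idx])
    return indices, quantized


# Hypothetical toy codebook with three 2-D entries (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 2.0)]
# Hypothetical content features, e.g. one vector per spatial location.
features = [(0.1, -0.2), (0.9, 1.2), (-0.8, 1.7)]

indices, quantized = quantize(features, codebook)
print(indices)
```

Because the content is reduced to a sequence of discrete indices, the same representation can, in principle, be consumed by a translation decoder or modeled by a generative prior over indices, which is the property the unified framework relies on.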