RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval
This work addresses controlled image generation for applications like content creation and image editing, presenting an incremental improvement over existing methods.
The paper tackles image generation from scene descriptions by using a differentiable patch retrieval module to incorporate retrieved patches as references, resulting in realistic and diverse images with improved patch compatibility.
Image generation from scene descriptions is a cornerstone technique for controlled generation, which benefits applications such as content creation and image editing. In this work, we aim to synthesize images from a scene description, using retrieved patches as reference. We propose a differentiable retrieval module, which lets us (1) make the entire pipeline end-to-end trainable, enabling the learning of better feature embeddings for retrieval; and (2) encourage the selection of mutually compatible patches through additional objective functions. We conduct extensive quantitative and qualitative experiments to demonstrate that the proposed method can generate realistic and diverse images, where the retrieved patches are reasonable and mutually compatible.
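
To make the idea of differentiable retrieval concrete, the following is a minimal sketch of one common way to relax a discrete selection: a Gumbel-softmax weighting over candidate patch embeddings. The abstract does not state which relaxation the method uses, so the technique here is an assumption, and the names `gumbel_softmax` and `soft_retrieve`, the dot-product scoring, and the plain-list embeddings are all hypothetical. The sketch shows only the forward computation in plain Python; in practice an autodiff framework would propagate gradients through the weights to the embedding function.

```python
import math
import random


def gumbel_softmax(scores, temperature=1.0, rng=random):
    """Return relaxed one-hot weights over candidates (assumed relaxation)."""
    noisy = []
    for s in scores:
        # Clamp the uniform sample away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((s + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled scores.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_retrieve(query, candidates, temperature=1.0, rng=random):
    """Score candidates against a query and return (weights, mixed embedding).

    Hypothetical scoring: dot product between the query embedding and each
    candidate patch embedding.
    """
    scores = [sum(q * c for q, c in zip(query, cand)) for cand in candidates]
    weights = gumbel_softmax(scores, temperature, rng)
    # The "retrieved" embedding is a weighted mix, so it varies smoothly
    # with the scores instead of jumping between candidates.
    mixed = [
        sum(w * cand[d] for w, cand in zip(weights, candidates))
        for d in range(len(query))
    ]
    return weights, mixed


if __name__ == "__main__":
    rng = random.Random(0)
    query = [1.0, 0.0]
    patches = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    weights, mixed = soft_retrieve(query, patches, temperature=0.5, rng=rng)
    print("selection weights:", weights)
    print("mixed embedding:", mixed)
```

Lowering `temperature` pushes the weights toward a hard one-hot choice, while higher values spread them across candidates. Because the selection is a smooth function of the scores, a loss on the generated image or on the compatibility of the chosen patches could, under these assumptions, update the embeddings used for retrieval.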