Shape-conditioned Image Generation by Learning Latent Appearance Representation from Unpaired Data
This work addresses the challenge of shape-conditioned image generation for tasks like training data synthesis in computer vision, offering a novel approach that improves control over object shapes beyond category labels.
The paper tackles the problem of generating images with detailed control over object shapes, where existing methods offer only limited control over object categories. It introduces SCGAN, which uses a latent appearance vector and unpaired training data to generate images of arbitrary object categories with target shapes and diverse appearances, and demonstrates its effectiveness through qualitative and quantitative evaluations.
Conditional image generation is effective for diverse tasks, including training data synthesis for learning-based computer vision. However, despite recent advances in generative adversarial networks (GANs), it is still challenging to generate images with detailed conditioning on object shapes. Existing methods for conditional image generation use category labels and/or keypoints and give only limited control over object categories. In this work, we present SCGAN, an architecture to generate images with a desired shape specified by an input normal map. The shape-conditioned image generation task is achieved by explicitly modeling the image appearance via a latent appearance vector. The network is trained using unpaired training samples of real images and rendered normal maps. This approach enables us to generate images of arbitrary object categories with the target shape and diverse image appearances. We show the effectiveness of our method through both qualitative and quantitative evaluation on training data generation tasks.
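To illustrate what training on unpaired samples could look like in practice, the following is a minimal sketch of batch construction, not the authors' implementation. It assumes that real images and rendered normal maps are drawn independently from two separate pools and that each normal map is accompanied by a latent appearance vector drawn from a standard normal prior; the abstract does not state the prior or the sampling scheme. The function name, file names, batch size, and latent dimension are all hypothetical.

```python
import random


def sample_unpaired_batch(real_images, normal_maps, batch_size, latent_dim, rng):
    """Draw one training batch with no pairing between images and normal maps."""
    # Real images and rendered normal maps are sampled independently, so
    # position i in one list has no relation to position i in the other.
    images = [rng.choice(real_images) for _ in range(batch_size)]
    shapes = [rng.choice(normal_maps) for _ in range(batch_size)]
    # One latent appearance vector per normal map.
    # Assumption: a standard normal prior (not specified in the abstract).
    appearances = [
        [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        for _ in range(batch_size)
    ]
    return images, shapes, appearances


# Hypothetical data pools; the two pools need not have the same size.
rng = random.Random(0)
real_images = [f"photo_{i}.png" for i in range(100)]
normal_maps = [f"render_{i}.png" for i in range(40)]

images, shapes, appearances = sample_unpaired_batch(
    real_images, normal_maps, batch_size=8, latent_dim=16, rng=rng
)
```

In a setup like this, a generator would receive each normal map together with its appearance vector, while a discriminator would compare the generated images against the independently sampled real images. Resampling the appearance vector for a fixed normal map is what would yield different appearances for the same shape.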