Attribute2Image: Conditional Image Generation from Visual Attributes
The paper addresses the challenge of controllable image synthesis, which is relevant to applications in computer vision and graphics, though the contribution is incremental because it builds on existing generative models.
The paper tackles the problem of generating images from visual attributes. It models each image as a composite of foreground and background, and it develops a layered generative model with disentangled latent variables that is learned end-to-end with a variational auto-encoder. It demonstrates realistic and diverse sample generation for faces and birds, and it reports excellent quantitative and visual results in attribute-conditioned image reconstruction and completion.
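To make the foreground/background composite concrete, here is a minimal sketch in Python with NumPy. It assumes the two layers are blended pixel-wise by a gating mask with values in [0, 1]. The summary above says only that the image is a composite, so the mask formulation and the names `composite`, `foreground`, `background`, and `gate` are illustrative assumptions, not the paper's code.

```python
import numpy as np

def composite(foreground, background, gate):
    """Blend two layers pixel-wise (assumed form, not taken from the paper):
    image = gate * foreground + (1 - gate) * background."""
    gate = np.clip(gate, 0.0, 1.0)  # keep the mask a valid blending weight
    return gate * foreground + (1.0 - gate) * background

rng = np.random.default_rng(0)
height, width = 4, 4
foreground = rng.random((height, width, 3))  # hypothetical foreground layer
background = rng.random((height, width, 3))  # hypothetical background layer

# Hypothetical mask: the foreground occupies the central 2x2 patch.
gate = np.zeros((height, width, 1))
gate[1:3, 1:3] = 1.0

image = composite(foreground, background, gate)
```

In a learned model the two layers and the mask would be produced by decoder networks from the attributes and latent variables. Here they are random arrays, used only to show the blending step.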
This paper investigates a novel problem of generating images from visual attributes. We model the image as a composite of foreground and background and develop a layered generative model with disentangled latent variables that can be learned end-to-end using a variational auto-encoder. We experiment with natural images of faces and birds and demonstrate that the proposed models are capable of generating realistic and diverse samples with disentangled latent representations. We use a general energy minimization algorithm for posterior inference of latent variables given novel images. Therefore, the learned generative models show excellent quantitative and visual results in the tasks of attribute-conditioned image reconstruction and completion.
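Posterior inference by energy minimization can be illustrated with a toy example. The sketch below assumes a linear decoder `x = W z + U y` and an energy made of squared reconstruction error plus an L2 penalty on the latent `z`. The abstract does not specify the energy or the decoder, so both choices are assumptions, and `W`, `U`, `lam`, `energy`, and `infer_latent` are hypothetical names. Gradient descent on `z` then recovers a latent code for a given image `x` and attribute vector `y`.

```python
import numpy as np

def energy(z, x, y, W, U, lam):
    """Assumed energy: squared reconstruction error plus an L2 prior on z."""
    residual = x - (W @ z + U @ y)
    return float(residual @ residual + lam * (z @ z))

def infer_latent(x, y, W, U, lam=0.1, steps=2000):
    """Minimize the energy over z by plain gradient descent."""
    z = np.zeros(W.shape[1])
    # Step size 1/L, where L bounds the curvature of this quadratic energy.
    lr = 1.0 / (2.0 * (np.linalg.norm(W, 2) ** 2 + lam))
    for _ in range(steps):
        residual = x - (W @ z + U @ y)
        grad = -2.0 * (W.T @ residual) + 2.0 * lam * z
        z = z - lr * grad
    return z

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 3))   # toy decoder weights for the latent code
U = rng.normal(size=(16, 2))   # toy decoder weights for the attributes
y = np.array([1.0, 0.0])       # hypothetical attribute vector
z_true = rng.normal(size=3)
x = W @ z_true + U @ y         # synthetic "image" to invert

z_hat = infer_latent(x, y, W, U)
```

With a neural decoder the gradient would come from automatic differentiation rather than the closed-form expression used here, but the loop has the same shape: hold the decoder and attributes fixed and update only the latent variables.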