Controllable Person Image Synthesis with Attribute-Decomposed GAN
The work addresses the problem of generating realistic, customizable person images for applications such as virtual try-on and content creation. It proposes a new method for a known bottleneck in this area: attribute control.
The paper tackles controllable person image synthesis by proposing an Attribute-Decomposed GAN that embeds human attributes as independent latent codes, enabling flexible control via mixing and interpolation. It reports results superior to state-of-the-art methods in pose transfer and shows the approach is effective for component attribute transfer.
This paper introduces the Attribute-Decomposed GAN, a novel generative model for controllable person image synthesis, which can produce realistic person images with desired human attributes (e.g., pose, head, upper clothes and pants) provided in various source inputs. The core idea of the proposed model is to embed human attributes into the latent space as independent codes and thus achieve flexible and continuous control of attributes via mixing and interpolation operations in explicit style representations. Specifically, a new architecture consisting of two encoding pathways with style block connections is proposed to decompose the original hard mapping into multiple more accessible subtasks. In the source pathway, component layouts are further extracted with an off-the-shelf human parser and fed into a shared global texture encoder to obtain decomposed latent codes. This strategy allows for the synthesis of more realistic output images and automatic separation of un-annotated attributes. Experimental results demonstrate the proposed method's superiority over the state of the art in pose transfer and its effectiveness in the brand-new task of component attribute transfer.
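The mixing and interpolation operations described above can be illustrated with a toy sketch. This is not the paper's implementation: the component names, the two-dimensional codes, the fixed-order concatenation in `full_style`, and all function names are assumptions made for illustration, and no encoder or generator is involved. It only shows how independent per-component codes allow one attribute to be swapped or blended without touching the others.

```python
from typing import Dict, List

# Hypothetical component names, taken from the examples in the abstract.
COMPONENTS = ["head", "upper_clothes", "pants"]

# A decomposed style code: one independent latent vector per component.
StyleCode = Dict[str, List[float]]


def mix_codes(base: StyleCode, donor: StyleCode, swap: List[str]) -> StyleCode:
    """Take the components named in `swap` from `donor`, the rest from `base`."""
    return {
        name: list(donor[name] if name in swap else base[name])
        for name in COMPONENTS
    }


def interpolate_codes(a: StyleCode, b: StyleCode, t: float) -> StyleCode:
    """Linearly blend every component code: (1 - t) * a + t * b."""
    return {
        name: [(1.0 - t) * x + t * y for x, y in zip(a[name], b[name])]
        for name in COMPONENTS
    }


def full_style(code: StyleCode) -> List[float]:
    """Concatenate the component codes in a fixed order (an assumed layout)."""
    return [value for name in COMPONENTS for value in code[name]]


# Toy codes standing in for encoder outputs of two source images.
person_a = {"head": [1.0, 0.0], "upper_clothes": [0.0, 2.0], "pants": [4.0, 4.0]}
person_b = {"head": [0.0, 1.0], "upper_clothes": [2.0, 0.0], "pants": [0.0, 8.0]}

# Component attribute transfer: person A wearing person B's upper clothes.
mixed = mix_codes(person_a, person_b, swap=["upper_clothes"])

# Continuous control: a style halfway between the two people.
halfway = interpolate_codes(person_a, person_b, t=0.5)
```

In a full model, a vector such as `full_style(mixed)` would condition the generator together with the target pose; here it is just a list of floats.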