Combining Attention with Flow for Person Image Synthesis
This work addresses image synthesis for person re-identification or virtual try-on, but its contribution is incremental because it builds on existing spatial transformation methods.
The paper tackles pose-guided person image synthesis by combining attention and flow-based operations to generate accurate target structures and sample realistic source textures, and it reports superior results in both objective and subjective experiments.
Pose-guided person image synthesis aims to synthesize person images by transforming reference images into target poses. In this paper, we observe that the commonly used spatial transformation blocks have complementary advantages. We propose a novel model that combines the attention operation with the flow-based operation. Our model not only takes advantage of the attention operation to generate accurate target structures but also uses the flow-based operation to sample realistic source textures. Both objective and subjective experiments demonstrate the superiority of our model. Comprehensive ablation studies verify our hypotheses and show the efficacy of the proposed modules. In addition, experiments on the portrait image editing task demonstrate the versatility of the proposed combination.
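
To make the two spatial transformation blocks concrete, the following is a minimal NumPy sketch rather than the model described in the paper. It assumes feature maps of shape (H, W, C), a dense flow field of per-pixel (dx, dy) offsets, and a soft mask in [0, 1] that blends the two warped results. The function names `attention_warp`, `flow_warp`, and `combine` are hypothetical, and the mask-based blending is an assumption: the abstract does not say how the two operations are combined.

```python
import numpy as np


def attention_warp(query, source):
    """Attention operation: every target position receives a
    softmax-weighted average of all source features."""
    c = source.shape[-1]
    q = query.reshape(-1, c)              # (Ht*Wt, C) target queries
    k = source.reshape(-1, c)             # (Hs*Ws, C) keys, reused as values
    logits = q @ k.T / np.sqrt(c)         # scaled dot-product similarity
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    return (weights @ k).reshape(query.shape[0], query.shape[1], c)


def flow_warp(source, flow):
    """Flow-based operation: bilinearly sample the source at
    (x + dx, y + dy), clamping coordinates to the image border."""
    h, w, _ = source.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = (x - x0)[..., None]              # horizontal interpolation weight
    wy = (y - y0)[..., None]              # vertical interpolation weight
    top = (1 - wx) * source[y0, x0] + wx * source[y0, x1]
    bottom = (1 - wx) * source[y1, x0] + wx * source[y1, x1]
    return (1 - wy) * top + wy * bottom


def combine(query, source, flow, mask):
    """Assumed fusion rule: mask = 1 selects the attention result,
    mask = 0 selects the flow-sampled result."""
    m = mask[..., None]
    return m * attention_warp(query, source) + (1 - m) * flow_warp(source, flow)
```

The sketch illustrates the complementarity noted above: the attention branch averages over many source positions, which suits coarse structure but can blur detail, while the flow branch copies features from a single sampled location, which preserves texture when the flow is accurate. In a trained model the flow and the mask would be predicted by the network; here they are plain inputs.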