Adversarial Synthesis of Human Pose from Text
This work addresses text-to-pose synthesis for applications such as animation and human-computer interaction; its contribution is incremental, since it builds on existing GAN methods.
The paper tackles the problem of generating 2D human poses from text descriptions using a conditional generative adversarial network, achieving plausible results on the COCO dataset, especially for actions with distinctive poses.
This work focuses on synthesizing human poses from human-level text descriptions. We propose a model based on a conditional generative adversarial network that generates 2D human poses conditioned on human-written text descriptions. The model is trained and evaluated on the COCO dataset, which consists of images capturing complex everyday scenes with various human poses. Qualitative and quantitative results show that the model can synthesize plausible poses that match the given text, indicating that it is possible to generate poses consistent with the given semantic features, especially for actions with distinctive poses.
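To make the conditioning idea concrete, the following is a minimal, untrained sketch of how a conditional GAN can tie a pose to a text embedding. It is not the model described above. The layer sizes, the coordinate-based pose representation, the random stand-in for an encoded caption, the standard non-saturating GAN losses, and all class and function names are assumptions made for illustration only. The one detail taken from the dataset is that COCO person annotations use 17 body keypoints.

```python
import math
import random

NUM_KEYPOINTS = 17  # COCO person annotations use 17 body keypoints
NOISE_DIM = 8       # hypothetical noise vector size
TEXT_DIM = 16       # hypothetical text embedding size
HIDDEN_DIM = 32     # hypothetical hidden-layer width


def init_linear(n_in, n_out, rng):
    """Create a dense layer as (weights, biases) with small random weights."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply a dense layer to the input vector x."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def leaky_relu(x, slope=0.2):
    return [v if v > 0 else slope * v for v in x]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class Generator:
    """Maps (noise, text embedding) to 17 normalized (x, y) keypoints."""

    def __init__(self, rng):
        self.hidden = init_linear(NOISE_DIM + TEXT_DIM, HIDDEN_DIM, rng)
        self.out = init_linear(HIDDEN_DIM, 2 * NUM_KEYPOINTS, rng)

    def __call__(self, noise, text_embedding):
        # Conditioning by concatenation: the text vector joins the noise.
        h = leaky_relu(linear(self.hidden, noise + text_embedding))
        flat = [sigmoid(v) for v in linear(self.out, h)]
        return [(flat[2 * i], flat[2 * i + 1]) for i in range(NUM_KEYPOINTS)]


class Discriminator:
    """Scores how plausible a pose is for the given text embedding."""

    def __init__(self, rng):
        self.hidden = init_linear(2 * NUM_KEYPOINTS + TEXT_DIM, HIDDEN_DIM, rng)
        self.out = init_linear(HIDDEN_DIM, 1, rng)

    def __call__(self, pose, text_embedding):
        flat = [coord for point in pose for coord in point]
        h = leaky_relu(linear(self.hidden, flat + text_embedding))
        return sigmoid(linear(self.out, h)[0])


def gan_losses(score_real, score_fake):
    """Standard non-saturating GAN losses (an assumption, not from the source)."""
    d_loss = -math.log(score_real) - math.log(1.0 - score_fake)
    g_loss = -math.log(score_fake)
    return d_loss, g_loss


rng = random.Random(0)
generator, discriminator = Generator(rng), Discriminator(rng)

# Random stand-ins: a real system would encode a caption and load a real pose.
text_embedding = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]
real_pose = [(rng.random(), rng.random()) for _ in range(NUM_KEYPOINTS)]

noise = [rng.gauss(0, 1) for _ in range(NOISE_DIM)]
fake_pose = generator(noise, text_embedding)

score_real = discriminator(real_pose, text_embedding)
score_fake = discriminator(fake_pose, text_embedding)
d_loss, g_loss = gan_losses(score_real, score_fake)
```

In this sketch both networks receive the same text embedding, so the discriminator judges whether a pose fits the description rather than only whether it looks like a pose. Training would alternate gradient updates that lower `d_loss` for the discriminator and `g_loss` for the generator; that step is omitted here.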