CV · Apr 11, 2019

FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation

Xiang Chen, Lingbo Qing, Xiaohai He, Xiaodong Luo, Yining Xu

arXiv:1904.05729v17.142 citations

Originality: Incremental advance

AI Analysis

It addresses text-to-face synthesis for public safety applications, but is incremental as it builds on existing text-to-image methods with a new dataset.

The paper tackles text-to-face generation by proposing FTGAN, a fully-trained GAN that trains the text encoder and image decoder simultaneously. It reports an average 59% similarity to the ground truth on the new SCU-Text2face dataset and raises the best reported Inception Score on CUB to 4.63.

As a sub-domain of text-to-image synthesis, text-to-face generation has great potential in the public safety domain. Owing to the lack of datasets, there is almost no research focusing on text-to-face synthesis. In this paper, we propose a fully-trained Generative Adversarial Network (FTGAN) that trains the text encoder and image decoder at the same time for fine-grained text-to-face generation. With a novel fully-trained generative network, FTGAN can synthesize higher-quality images and make its outputs more relevant to the input sentences. In addition, we build a dataset called SCU-Text2face for text-to-face synthesis. Through extensive experiments, FTGAN shows its superiority in improving both the quality of the generated images and their similarity to the input descriptions. The proposed FTGAN outperforms the previous state of the art, boosting the best reported Inception Score to 4.63 on the CUB dataset. On SCU-Text2face, the face images generated by FTGAN from the input descriptions alone have an average similarity of 59% to the ground truth, which sets a baseline for text-to-face synthesis.
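For readers unfamiliar with the Inception Score quoted above, the following is a minimal sketch of the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]). It is not the authors' evaluation code. The function name `inception_score` and its input format (a list of per-image class-probability lists) are assumptions made for illustration; in practice those probabilities would come from a pretrained Inception classifier applied to the generated images.

```python
import math


def inception_score(probs):
    """Compute the Inception Score from per-image class probabilities.

    probs: list of lists, where probs[i][j] is p(y=j | x_i) for image i.
    (Hypothetical input format; real pipelines obtain these values from
    a pretrained Inception network.)
    """
    n = len(probs)
    k = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]
    # Mean KL divergence between each conditional p(y|x) and p(y).
    mean_kl = 0.0
    for p in probs:
        kl = sum(pj * math.log(pj / mj)
                 for pj, mj in zip(p, marginal) if pj > 0)
        mean_kl += kl / n
    return math.exp(mean_kl)
```

Under this definition the score ranges from 1 (every image receives the same class distribution) up to the number of classes (each image is classified with full confidence and the classes are evenly covered), so a higher score suggests sharper and more diverse outputs.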
