CV, LG · May 25, 2020

SegAttnGAN: Text to Image Generation with Segmentation Attention

arXiv:2005.12444v1 · 25 citations
Originality: Incremental advance
AI Analysis

This work addresses text-to-image generation for applications such as content creation. It is an incremental advance: it builds on existing GAN methods and adds segmentation guidance.

The authors approach text-to-image synthesis by incorporating segmentation information into the generative network, which improves realism and quantitative scores: the paper reports an Inception Score of 4.84 on CUB and 3.52 on Oxford-102.

In this paper, we propose a novel generative network (SegAttnGAN) that utilizes additional segmentation information for the text-to-image synthesis task. Because the segmentation data introduced to the model provides useful guidance for generator training, the proposed model can generate more realistic images and achieve higher quantitative scores than previous state-of-the-art methods. We achieved an Inception Score of 4.84 on the CUB dataset and 3.52 on the Oxford-102 dataset. In addition, we tested the self-attention SegAttnGAN, which uses generated segmentation data instead of masks from datasets for attention, and achieved similarly high-quality results, suggesting that our model can be adapted for the text-to-image synthesis task.
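
For readers unfamiliar with the metric quoted above, the Inception Score is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated image and p(y) is the average of those distributions over all generated images. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function name `inception_score` and the toy inputs are hypothetical, and it assumes the class probabilities have already been produced by some pretrained classifier (in practice, an Inception network).

```python
import math


def inception_score(probs, eps=1e-12):
    """Compute exp(mean KL(p(y|x) || p(y))) from class probabilities.

    probs: list of per-image probability lists, each summing to 1.
    eps:   small constant to avoid log(0); an assumption of this sketch.
    """
    n = len(probs)
    k = len(probs[0])

    # Marginal class distribution p(y): the average prediction over all images.
    marginal = [sum(p[j] for p in probs) / n for j in range(k)]

    # Mean KL divergence between each conditional p(y|x) and the marginal p(y).
    mean_kl = 0.0
    for p in probs:
        mean_kl += sum(
            pj * (math.log(pj + eps) - math.log(marginal[j] + eps))
            for j, pj in enumerate(p)
        )
    mean_kl /= n

    return math.exp(mean_kl)


# Toy inputs (hypothetical): confident, evenly spread predictions score higher.
confident = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
print(inception_score(confident), inception_score(uncertain))
```

Under this definition the score is bounded below by 1 and above by the number of classes, so higher values indicate images that are classified confidently and cover many classes. Published scores are usually averaged over several splits of a large sample of generated images, a step this sketch omits.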
