CV · Jun 26, 2023

A Simple and Effective Baseline for Attentional Generative Adversarial Networks

Mingyu Jin, Chong Zhang, Qinkai Yu, Haochen Xue, Xiaobo Jin, Xi Yang

arXiv:2306.14708v21.51 citations · h-index: 9 · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses efficiency issues in text-to-image generation models for researchers and practitioners, but it is incremental as it builds on existing AttnGAN methods.

The authors tackled the problem of redundancy in attentional generative adversarial networks (AttnGAN) for text-to-image synthesis, resulting in SEAttnGAN, which reduces model size and improves training efficiency while maintaining performance.

Synthesising high-quality images by guiding a generative model with a text description is an innovative and challenging task. Several GAN-based approaches have been proposed in recent years: AttnGAN, which uses an attention mechanism to guide GAN training; SD-GAN, which adopts a self-distillation technique to improve generator performance and image quality; and Stack-GAN++, which progressively improves image detail and quality by stacking multiple generators and discriminators. However, these improvements to GANs all carry some degree of redundancy, which affects generation performance and adds complexity. Following the popular "simple and effective" idea, we (1) remove redundant structure and improve the backbone network of AttnGAN, and (2) integrate and reconstruct the multiple losses of DAMSM. These changes significantly reduce model size and improve training efficiency while keeping the model's performance unchanged. We call the resulting model SEAttnGAN. Code is available at https://github.com/jmyissb/SEAttnGAN.
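As a rough illustration of the attention mechanism mentioned in the abstract, here is a minimal sketch of word-to-region attention in the general style of attention-guided text-to-image GANs. It is not taken from the SEAttnGAN code. The function names `softmax` and `word_attention`, the plain-list vectors, and the dot-product scoring are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_attention(regions, words):
    """Hypothetical helper: build one word-context vector per image region.

    regions: list of region feature vectors (lists of floats)
    words:   list of word embedding vectors with the same dimension
    Each region scores every word by dot product; the softmax of those
    scores weights the word vectors that are summed into the context.
    """
    dim = len(words[0])
    contexts = []
    for region in regions:
        scores = [sum(r * w for r, w in zip(region, word)) for word in words]
        weights = softmax(scores)
        context = [
            sum(weight * word[d] for weight, word in zip(weights, words))
            for d in range(dim)
        ]
        contexts.append(context)
    return contexts


# Toy input: two regions, two words, two-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[10.0, 0.0], [0.0, 10.0]]
contexts = word_attention(regions, words)
```

In this toy input each region lines up with one word, so its context vector is dominated by that word's embedding. A real model would compute the same weighting over learned features in a tensor library.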
