CV | Apr 17, 2022

DR-GAN: Distribution Regularization for Text-to-Image Generation

arXiv:2204.07945v1 | 54 citations | h-index: 48
Originality: Incremental advance
AI Analysis

This work addresses text-to-image generation, a domain-specific task in computer vision, with incremental improvements.

The paper tackled the problem of generating images from text descriptions by proposing DR-GAN, which introduced modules for semantic disentangling and distribution normalization, achieving competitive performance on two public datasets.

This paper presents a new Text-to-Image generation model, named Distribution Regularization Generative Adversarial Network (DR-GAN), to generate images from text descriptions through improved distribution learning. In DR-GAN, we introduce two novel modules: a Semantic Disentangling Module (SDM) and a Distribution Normalization Module (DNM). SDM combines the spatial self-attention mechanism and a new Semantic Disentangling Loss (SDL) to help the generator distill key semantic information for image generation. DNM uses a Variational Auto-Encoder (VAE) to normalize and denoise the image latent distribution, which can help the discriminator better distinguish synthesized images from real images. DNM also adopts a Distribution Adversarial Loss (DAL) to guide the generator to align with normalized real image distributions in the latent space. Extensive experiments on two public datasets demonstrated that our DR-GAN achieved competitive performance in the Text-to-Image task.
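The abstract says DNM uses a VAE to normalize the image latent distribution but gives no formulas. As a rough illustration of what VAE-style normalization usually means, the sketch below shows the standard reparameterization step and the KL penalty that pulls a diagonal Gaussian latent toward N(0, I). It is a generic VAE sketch, not the paper's DNM or its Distribution Adversarial Loss. The function names `reparameterize` and `kl_to_standard_normal` and the example values are assumptions made for illustration.

```python
import math
import random

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dim."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian, summed over dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, logvar))

# Hypothetical encoder outputs for one image (illustrative values only).
rng = random.Random(0)
mu = [0.5, -1.0, 0.0]
logvar = [0.0, -0.5, 0.2]

z = reparameterize(mu, logvar, rng)      # latent sample fed to later stages
kl = kl_to_standard_normal(mu, logvar)   # regularizer pulling the latent toward N(0, I)
```

The KL term is zero only when the mean is 0 and the variance is 1 in every dimension, so minimizing it pushes encoded latents toward a common normalized distribution. How DR-GAN combines such a term with its adversarial losses is not stated in the abstract.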

Code Implementations: 1 repo
