Adversarial Pixel-Level Generation of Semantic Images
This work addresses a specific challenge in computer vision, namely precise semantic image generation, though the contribution appears incremental because it builds on existing GAN frameworks.
The paper tackles the problem of generating semantic images with pixel-level accuracy, which is crucial for tasks like semantic segmentation, and presents a novel architecture called SemGANs that outperforms standard methods both quantitatively and qualitatively.
Generative Adversarial Networks (GANs) have achieved extraordinary success in generating realistic images, a domain where lower pixel-level accuracy is acceptable. We study the problem, not yet tackled in the literature, of generating semantic images starting from a prior distribution. Intuitively, this problem can be approached with standard methods and architectures. However, tasks like semantic segmentation require pixel-level exactness, so a better-suited approach is needed to avoid generating blurry, hallucinated, and thus unusable images. In this work, we present a novel architecture for learning to generate pixel-level accurate semantic images, namely Semantic Generative Adversarial Networks (SemGANs). The experimental evaluation shows that our architecture outperforms standard ones both quantitatively and qualitatively on many semantic image generation tasks.