CV · Nov 25, 2022

Unifying conditional and unconditional semantic image synthesis with OCO-GAN

arXiv:2211.14105v1 · h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses the need for more flexible and efficient generative models in computer vision by combining two tasks that are typically studied separately. The advance is incremental, since it builds on existing GAN frameworks.

The paper unifies conditional and unconditional semantic image synthesis by proposing OCO-GAN, which uses a shared network to generate images either from semantic maps or from latents. It achieves performance competitive with or better than specialized models on datasets such as Cityscapes and COCO-Stuff.

Generative image models have been extensively studied in recent years. In the unconditional setting, they model the marginal distribution from unlabelled images. To allow for more control, image synthesis can be conditioned on semantic segmentation maps that tell the generator where objects are positioned in the image. While these two tasks are intimately related, they are generally studied in isolation. We propose OCO-GAN, for Optionally COnditioned GAN, which addresses both tasks in a unified manner, with a shared image synthesis network that can be conditioned either on semantic maps or directly on latents. Because the model is trained adversarially end to end with a shared discriminator, we are able to leverage the synergy between both tasks. We experiment with the Cityscapes, COCO-Stuff and ADE20K datasets in limited-data, semi-supervised and full-data regimes and obtain excellent performance, improving over existing hybrid models that can generate both with and without conditioning in all settings. Moreover, our results are competitive with or better than state-of-the-art specialised unconditional and conditional models.
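The idea of a shared synthesis network with optional conditioning can be illustrated with a toy sketch. This is not the OCO-GAN architecture: the class name, layer sizes, and the additive way the label embedding is combined with the latent are all assumptions made for illustration, and the shared discriminator and adversarial training are omitted. It only shows one set of synthesis weights being driven either by a latent alone or by a latent plus a semantic map.

```python
import random

NUM_CLASSES = 3   # hypothetical number of semantic classes
LATENT_DIM = 4    # hypothetical latent size
FEAT_DIM = 8      # hypothetical feature width


def make_weights(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(w, x):
    """Matrix-vector product for list-of-rows matrices."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


class ToyOptionallyConditionedGenerator:
    """Toy stand-in: one shared synthesis layer, two ways to build its input."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.latent_embed = make_weights(FEAT_DIM, LATENT_DIM, rng)  # used in both modes
        self.label_embed = make_weights(FEAT_DIM, NUM_CLASSES, rng)  # conditional mode only
        self.shared_synth = make_weights(3, FEAT_DIM, rng)           # shared: features -> RGB

    def forward(self, z, height, width, semantic_map=None):
        z_feat = matvec(self.latent_embed, z)
        image = []
        for y in range(height):
            row = []
            for x in range(width):
                feat = list(z_feat)
                if semantic_map is not None:
                    # Conditional mode: add a per-pixel embedding of the class label.
                    label = semantic_map[y][x]
                    one_hot = [1.0 if c == label else 0.0 for c in range(NUM_CLASSES)]
                    label_feat = matvec(self.label_embed, one_hot)
                    feat = [a + b for a, b in zip(feat, label_feat)]
                # The same synthesis weights are used in both modes.
                row.append(matvec(self.shared_synth, feat))
            image.append(row)
        return image


gen = ToyOptionallyConditionedGenerator(seed=0)
z = [0.5, -0.2, 0.1, 0.9]
uncond = gen.forward(z, 2, 2)                                 # latent only
cond = gen.forward(z, 2, 2, semantic_map=[[0, 1], [2, 0]])    # latent + semantic map
```

In this toy, the unconditional output has no spatial variation because the latent is simply broadcast to every pixel. A real generator would produce spatially varying features from the latent, so the sketch should be read only as an illustration of weight sharing between the two modes.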

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
