Structured Prediction using cGANs with Fusion Discriminator
This work addresses the need for flexible, consistent models for structured prediction in computer vision, though the contribution appears incremental because it builds on existing GAN and CNN-CRF methods.
The authors tackle structured prediction tasks such as image synthesis and semantic segmentation by proposing a fusion discriminator for conditional GANs, and they report improved results across multiple tasks.
We propose the fusion discriminator, a single unified framework for incorporating conditional information into a generative adversarial network (GAN) for a variety of distinct structured prediction tasks, including image synthesis, semantic segmentation, and depth estimation. Much like commonly used models that combine a convolutional neural network with a conditional random field (CNN-CRF), the proposed method is able to enforce higher-order consistency in the model, but without being limited to a very specific class of potentials. The method is conceptually simple and flexible, and our experimental results demonstrate improvement on several diverse structured prediction tasks.
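As a rough illustration only, the idea of incorporating conditional information inside the discriminator can be sketched in a few lines. The text above does not specify the architecture, so this sketch assumes that "fusion" means the condition x and the candidate output y pass through separate branches whose features are added element-wise at each stage. Dense layers stand in for convolutions, and every name here (`fusion_discriminator`, `x_layers`, `y_layers`, `head`) is hypothetical.

```python
import math

def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def linear(v, weights, bias):
    """Dense layer: one output per row of `weights`."""
    return [sum(w * a for w, a in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def fusion_discriminator(x, y, x_layers, y_layers, head):
    """Score a (condition x, candidate output y) pair in (0, 1).

    Assumption: features of x are fused into the y branch by
    element-wise addition after every stage. This is a toy stand-in,
    not the architecture from the paper.
    """
    hx, hy = x, y
    for (wx, bx), (wy, by) in zip(x_layers, y_layers):
        hx = relu(linear(hx, wx, bx))         # condition branch
        hy = relu(linear(hy, wy, by))         # output branch
        hy = [a + b for a, b in zip(hx, hy)]  # fuse by addition
    w, b = head
    logit = linear(hy, w, b)[0]
    return 1.0 / (1.0 + math.exp(-logit))     # sigmoid "real" probability

# Toy weights chosen by hand so the arithmetic is easy to follow.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
head = ([[1.0, -1.0]], [0.0])
score = fusion_discriminator([1.0, 2.0], [3.0, 4.0],
                             [identity], [identity], head)
print(round(score, 4))
```

Because the condition features are added into the output branch at every stage, the score depends on x and y jointly rather than on y alone. A common alternative, concatenating x and y once at the input, would be a different design choice.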