GAN-Supervised Dense Visual Alignment
This work addresses the problem of densely aligning complex visual data without correspondence supervision, with applications such as augmented reality and image editing. It introduces a new learning framework rather than an incremental refinement of existing methods.
The paper tackles the dense visual alignment problem by proposing GAN-Supervised Learning, a framework that jointly learns a discriminative model and its GAN-generated training data end-to-end. The resulting method outperforms past self-supervised correspondence algorithms and matches or exceeds state-of-the-art supervised ones on several datasets, improving precise correspondence by as much as $3\times$.
We propose GAN-Supervised Learning, a framework for learning discriminative models and their GAN-generated training data jointly end-to-end. We apply our framework to the dense visual alignment problem. Inspired by the classic Congealing method, our GANgealing algorithm trains a Spatial Transformer to map random samples from a GAN trained on unaligned data to a common, jointly-learned target mode. We show results on eight datasets, all of which demonstrate that our method successfully aligns complex data and discovers dense correspondences. GANgealing significantly outperforms past self-supervised correspondence algorithms and performs on par with (and sometimes exceeds) state-of-the-art supervised correspondence algorithms on several datasets -- without making use of any correspondence supervision or data augmentation and despite being trained exclusively on GAN-generated data. For precise correspondence, we improve upon state-of-the-art supervised methods by as much as $3\times$. We show applications of our method for augmented reality, image editing and automated pre-processing of image datasets for downstream GAN training.
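The joint training idea described above can be illustrated with a deliberately small toy, which is a sketch under stated assumptions and not the GANgealing implementation. Everything in it is hypothetical: `generate` stands in for a frozen GAN generator whose latent only shifts a fixed 1D shape, `transform` stands in for a Spatial Transformer that predicts and undoes that shift, and the scalar `c` plays the role of the jointly-learned target latent. The loss (mean squared error between the warped sample and the generated target) is an assumption chosen for simplicity; the abstract does not specify the actual objective.

```python
import random

# Hypothetical toy data: a "sample" is a list of 1D point coordinates.
TEMPLATE = [-1.0, 0.0, 1.0]  # zero-mean shape shared by every sample


def generate(w):
    """Stand-in for a frozen generator: the latent w only shifts the shape."""
    return [p + w for p in TEMPLATE]


def transform(x, a, b):
    """Stand-in for a Spatial Transformer: predict a shift from x, then undo it."""
    mean = sum(x) / len(x)
    shift = a * mean + b
    return [p - shift for p in x]


def train(steps=2000, lr=0.05, seed=0):
    """Jointly fit the transformer (a, b) and the target latent c by SGD."""
    rng = random.Random(seed)
    a, b, c = 0.0, 0.0, 0.5
    for _ in range(steps):
        w = rng.uniform(-2.0, 2.0)      # random latent, i.e. a random sample
        x = generate(w)
        mean = sum(x) / len(x)
        warped = transform(x, a, b)
        target = generate(c)            # the jointly learned target mode
        # Every point has the same residual here, so the MSE loss is r ** 2.
        r = sum(u - v for u, v in zip(warped, target)) / len(x)
        # Gradients of r ** 2: dr/da = -mean, dr/db = -1, dr/dc = -1.
        a -= lr * 2.0 * r * (-mean)
        b -= lr * 2.0 * r * (-1.0)
        c -= lr * 2.0 * r * (-1.0)
    return a, b, c


a, b, c = train()
aligned = [transform(generate(w), a, b) for w in (-1.5, 0.0, 0.7)]
```

After training, each entry of `aligned` should coincide with `generate(c)` up to numerical error, regardless of the shift in the input sample. The point of the toy is only that the transformer and the target are optimized together on generated samples, with no labelled correspondences; a real system would replace both stand-ins with neural networks operating on images.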