Extracting Semantic Knowledge from GANs with Unsupervised Learning
This work addresses the challenge of interpreting and using GANs for downstream tasks such as segmentation and synthesis. It offers a method for generating paired image-segmentation datasets without manual labeling. The contribution is incremental in that it builds on prior findings about linear separability in GANs.
The authors tackle the problem of extracting semantic knowledge from GANs by proposing KLiSH, a clustering algorithm that exploits linear separability to cluster GAN features, enabling unsupervised semantic segmentation and semantic-conditional image synthesis without human annotations.
Recently, unsupervised learning has made impressive progress on various tasks. Despite the dominance of discriminative models, increasing attention is being drawn to representations learned by generative models, in particular Generative Adversarial Networks (GANs). Previous work on the interpretation of GANs reveals that GANs encode semantics in feature maps in a linearly separable form. In this work, we further find that GAN features can be clustered well under the linear separability assumption. We propose a novel clustering algorithm, named KLiSH, which leverages linear separability to cluster GAN features. KLiSH succeeds in extracting fine-grained semantics from GANs trained on datasets of various objects, e.g., cars, portraits, and animals. With KLiSH, we can sample images from GANs along with their segmentation masks and thereby synthesize paired image-segmentation datasets. Using the synthesized datasets, we enable two downstream applications. First, we train semantic segmentation networks on these datasets and test them on real images, realizing unsupervised semantic segmentation. Second, we train image-to-image translation networks on the synthesized datasets, enabling semantic-conditional image synthesis without human annotations.
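To make the linear separability idea concrete, the following minimal sketch shows how a per-pixel linear classifier could turn a feature map into a segmentation mask. It is an illustration under stated assumptions, not the KLiSH algorithm itself: the function name `linear_segmentation`, the toy feature values, and the hand-picked weights are all hypothetical, and in practice the features would come from a generator's intermediate layers and the classifier would be learned from cluster assignments.

```python
from typing import List

FeatureMap = List[List[List[float]]]  # H x W x C, one feature vector per pixel


def linear_segmentation(
    features: FeatureMap,
    weights: List[List[float]],  # K x C, one weight vector per class (assumed given)
    biases: List[float],         # K, one bias per class (assumed given)
) -> List[List[int]]:
    """Label each pixel with the class whose linear score is highest."""
    mask = []
    for row in features:
        mask_row = []
        for f in row:
            # Linear score for each class: w_k . f + b_k
            scores = [
                sum(w_c * f_c for w_c, f_c in zip(w, f)) + b
                for w, b in zip(weights, biases)
            ]
            # Index of the best-scoring class (first one wins on ties)
            mask_row.append(max(range(len(scores)), key=scores.__getitem__))
        mask.append(mask_row)
    return mask


# Hypothetical 2x3 feature map with 2 channels; values are made up.
toy_features = [
    [[1.0, 0.0], [0.9, 0.1], [0.1, 0.8]],
    [[0.8, 0.2], [0.2, 0.9], [0.0, 1.0]],
]
# Hypothetical classifier: class 0 favors channel 0, class 1 favors channel 1.
toy_weights = [[1.0, -1.0], [-1.0, 1.0]]
toy_biases = [0.0, 0.0]

toy_mask = linear_segmentation(toy_features, toy_weights, toy_biases)
print(toy_mask)
```

If features are linearly separable in this sense, pairing each sampled image with the mask produced from its own feature maps yields the kind of image-segmentation pair the abstract describes.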