Semi-Supervised StyleGAN for Disentanglement Learning
This work addresses disentanglement learning for high-resolution image generation and editing. Its contribution may appear incremental, since it extends StyleGAN with semi-supervision.
The paper tackles limitations of current disentanglement methods, such as difficulty with high-resolution images and non-identifiability, by designing a semi-supervised StyleGAN-based approach that achieves good disentanglement with only 0.25% to 2.5% labeled data on synthetic and real datasets.
Disentanglement learning is crucial for obtaining disentangled representations and controllable generation. Current disentanglement methods face several inherent limitations: difficulty with high-resolution images, a primary focus on learning disentangled representations, and non-identifiability due to the unsupervised setting. To alleviate these limitations, we design new architectures and loss functions based on StyleGAN (Karras et al., 2019) for semi-supervised high-resolution disentanglement learning. We create two complex high-resolution synthetic datasets for systematic testing. We investigate the impact of limited supervision and find that using only 0.25% to 2.5% of labeled data is sufficient for good disentanglement on both synthetic and real datasets. We propose new metrics to quantify generator controllability, and observe that there may exist a crucial trade-off between disentangled representation learning and controllable generation. We also consider semantic fine-grained image editing to achieve better generalization to unseen images.
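To give a sense of the label budget implied by 0.25% to 2.5% supervision, the following minimal sketch splits a dataset into labeled and unlabeled indices. The dataset size of 100,000, the function name `split_labeled`, and the uniform random selection are assumptions made for illustration; they are not details taken from the paper.

```python
import random


def split_labeled(num_samples, labeled_fraction, seed=0):
    """Split sample indices into a small labeled set and a large unlabeled set.

    Hypothetical helper: the paper does not state how labeled samples are chosen;
    uniform random selection is assumed here.
    """
    if not 0.0 < labeled_fraction <= 1.0:
        raise ValueError("labeled_fraction must be in (0, 1]")
    # Keep at least one labeled sample even for tiny datasets.
    num_labeled = max(1, round(num_samples * labeled_fraction))
    rng = random.Random(seed)
    indices = list(range(num_samples))
    rng.shuffle(indices)
    labeled = sorted(indices[:num_labeled])
    unlabeled = sorted(indices[num_labeled:])
    return labeled, unlabeled


# Assumed dataset size, for illustration only.
NUM_SAMPLES = 100_000
low_labeled, low_unlabeled = split_labeled(NUM_SAMPLES, 0.0025)   # 0.25% labeled
high_labeled, high_unlabeled = split_labeled(NUM_SAMPLES, 0.025)  # 2.5% labeled
print(len(low_labeled), len(high_labeled))
```

Under these assumptions, the two ends of the range correspond to 250 and 2,500 labeled images out of 100,000, with the remainder used without labels.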