StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation
StarGAN addresses the scalability and robustness issues of multi-domain image translation, particularly for tasks such as facial attribute transfer and facial expression synthesis. It is a novel approach rather than an incremental improvement.
The paper tackles image-to-image translation across multiple domains, which previously required a separate model for each pair of domains. It proposes StarGAN, a unified model that uses a single network to translate an image to any desired target domain, with higher translated-image quality than existing models.
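To make the scaling argument concrete, the short sketch below counts how many separate translators a per-pair approach would need as the number of domains grows, compared with one unified model. The function name and the `directed` flag are illustrative assumptions; the count assumes one model per ordered pair of domains, or one per unordered pair if a single model covers both directions.

```python
def pairwise_model_count(num_domains: int, directed: bool = True) -> int:
    """Count the models needed when every domain pair gets its own translator.

    directed=True assumes one model per ordered pair (A->B and B->A are separate).
    directed=False assumes one model covers both directions of a pair.
    """
    if num_domains < 2:
        return 0  # no translation is possible with fewer than two domains
    ordered_pairs = num_domains * (num_domains - 1)
    return ordered_pairs if directed else ordered_pairs // 2


# A unified model needs one network regardless of the number of domains.
UNIFIED_MODEL_COUNT = 1

for k in (2, 4, 8):
    print(f"{k} domains: {pairwise_model_count(k)} pairwise models "
          f"vs {UNIFIED_MODEL_COUNT} unified model")
```

The per-pair count grows quadratically with the number of domains, while the unified count stays constant, which is the scalability gap the summary describes.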
Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models must be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training on multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
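For a single network to translate an input to "any desired target domain", it has to be told which domain is wanted. The abstract does not say how this is done, so the sketch below shows only one plausible scheme, stated as an assumption: encode the target domain as a one-hot label, replicate each label entry into a constant plane the size of the image, and stack those planes onto the image channels before feeding the network. The names `one_hot` and `attach_target_domain` are hypothetical, and images are plain nested lists (channels x height x width) to keep the example dependency-free.

```python
from typing import List

Image = List[List[List[float]]]  # channels x height x width


def one_hot(index: int, num_domains: int) -> List[float]:
    """Encode a domain index as a one-hot vector of length num_domains."""
    if not 0 <= index < num_domains:
        raise ValueError("domain index out of range")
    return [1.0 if i == index else 0.0 for i in range(num_domains)]


def attach_target_domain(image: Image, target: int, num_domains: int) -> Image:
    """Stack one constant plane per domain onto the image channels.

    Assumed conditioning scheme (not taken from the abstract): the plane for
    the target domain is all ones and every other plane is all zeros.
    """
    height = len(image[0])
    width = len(image[0][0])
    label_planes = [
        [[value] * width for _ in range(height)]
        for value in one_hot(target, num_domains)
    ]
    # New outer list: original channels first, then the label planes.
    return image + label_planes


# Usage: a 3-channel 2x2 image conditioned on domain 1 out of 4 domains.
rgb = [[[0.5] * 2 for _ in range(2)] for _ in range(3)]
conditioned = attach_target_domain(rgb, target=1, num_domains=4)
print(len(conditioned))  # channel count: 3 image channels + 4 label planes
```

Under this scheme the same generator weights serve every target domain, and only the label planes change between translations.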