US-GAN: On the importance of Ultimate Skip Connection for Facial Expression Synthesis
This work addresses facial expression synthesis for applications such as animation and human-computer interaction. It offers a more efficient and effective method, though the contribution is incremental because it builds on existing GAN architectures.
The paper tackles facial expression synthesis by adding an ultimate skip connection to a GAN so that identity and facial details are transferred directly from input to output. The resulting model has 3x fewer parameters than state-of-the-art models, is trained on a much smaller dataset, and achieves a 7% increase in face verification score and 25-58% improvements in user-study metrics.
We demonstrate the benefit of using an ultimate skip (US) connection for facial expression synthesis with generative adversarial networks (GANs). A direct connection transfers identity, facial, and color details from input to output while suppressing artifacts. The intermediate layers can therefore focus on expression generation only. This leads to a lightweight US-GAN model composed of encoding layers, a single residual block, decoding layers, and an ultimate skip connection from input to output. US-GAN has $3\times$ fewer parameters than state-of-the-art models and is trained on a dataset $2$ orders of magnitude smaller. It yields a $7\%$ increase in face verification score (FVS) and a $27\%$ decrease in average content distance (ACD). In a randomized user study, US-GAN outperforms the state of the art by $25\%$ in face realism, $43\%$ in expression quality, and $58\%$ in identity preservation.
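The idea of an input-to-output skip connection can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the skip connection is additive and that the result is clamped to the range [-1, 1]; the abstract specifies neither. The names `generator_with_ultimate_skip`, `zero_body`, and `toy_expression_body` are hypothetical, and `body` stands in for the encoding layers, residual block, and decoding layers.

```python
from typing import Callable, List

# A tiny grayscale "image" as nested lists, with pixel values in [-1, 1].
Image = List[List[float]]


def clamp(x: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Keep a pixel value inside the valid range."""
    return max(lo, min(hi, x))


def generator_with_ultimate_skip(image: Image, body: Callable[[Image], Image]) -> Image:
    """Add the body's output to the input pixel by pixel (assumed additive skip).

    `body` is a placeholder for encoder -> residual block -> decoder.
    Because the input is carried straight to the output, `body` only
    has to produce the change needed for the new expression.
    """
    residual = body(image)
    return [
        [clamp(p + r) for p, r in zip(row_in, row_res)]
        for row_in, row_res in zip(image, residual)
    ]


def zero_body(image: Image) -> Image:
    """A body that proposes no change at all."""
    return [[0.0 for _ in row] for row in image]


def toy_expression_body(image: Image) -> Image:
    """A body that changes a single pixel, standing in for a local expression edit."""
    out = zero_body(image)
    out[0][0] = 0.5
    return out


face = [[0.25, -0.5], [0.75, 0.0]]
unchanged = generator_with_ultimate_skip(face, zero_body)
edited = generator_with_ultimate_skip(face, toy_expression_body)
```

Under these assumptions, a body that outputs zeros reproduces the input exactly, and a body that edits one region leaves every other pixel untouched. This matches the intuition that the intermediate layers can focus on expression generation only.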