Semantically Robust Unpaired Image Translation for Data with Unmatched Semantics Statistics
This work addresses semantic flipping in unpaired image-to-image translation, a failure mode that matters for any application requiring the semantic content of the input to be preserved.
The paper addresses semantic flipping in unpaired image-to-image translation, where existing GAN-based methods can fail to preserve input semantics because the semantic distributions of the source and target domains do not match. By optimizing a robustness loss over multi-scale feature-space perturbations of the inputs, the method enforces semantic robustness, reduces semantic flipping, and outperforms existing methods both quantitatively and qualitatively.
Many applications of unpaired image-to-image translation require the input contents to be preserved semantically during translation. Unaware of the inherently unmatched semantic distributions between source and target domains, existing distribution matching methods (i.e., GAN-based) can give undesired solutions. In particular, although they produce visually reasonable outputs, the learned models usually flip the semantics of the inputs. To tackle this without using extra supervision, we propose to require the translated outputs to be semantically invariant w.r.t. small perceptual variations of the inputs, a property we call "semantic robustness". By optimizing a robustness loss w.r.t. multi-scale feature space perturbations of the inputs, our method effectively reduces semantic flipping and produces translations that outperform existing methods both quantitatively and qualitatively.
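To make the idea of a robustness loss over multi-scale feature space perturbations concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy "generator" given as a list of layer functions, unit-direction Gaussian noise scaled to a fixed norm `eps`, and a squared output change normalized by the squared perturbation norm. The names `robustness_loss`, `run`, `l2_sq`, and `eps` are hypothetical, and the exact form of the loss is an assumption made for illustration.

```python
import math
import random


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def run(layers, h):
    """Apply a sequence of layer functions to a feature vector."""
    for layer in layers:
        h = layer(h)
    return h


def robustness_loss(layers, x, eps=0.1, rng=None):
    """Average sensitivity of the output to small feature perturbations.

    Assumption (not taken from the paper): at every scale k, the features
    entering layer k are perturbed by noise of norm `eps`, and the squared
    change in the final output is divided by eps**2. Scale k = 0 perturbs
    the input itself.
    """
    rng = rng or random.Random(0)
    clean = run(layers, x)
    total = 0.0
    for k in range(len(layers)):
        h = run(layers[:k], x)  # features at scale k
        noise = [rng.gauss(0.0, 1.0) for _ in h]
        norm = math.sqrt(sum(n * n for n in noise))
        tau = [eps * n / norm for n in noise]  # perturbation with norm eps
        perturbed = run(layers[k:], [u + t for u, t in zip(h, tau)])
        total += l2_sq(clean, perturbed) / (eps ** 2)
    return total / len(layers)


# Toy layers standing in for generator blocks (hypothetical).
def double(h):
    return [2.0 * v for v in h]


def squash(h):
    return [math.tanh(v) for v in h]


if __name__ == "__main__":
    x = [0.5, -1.0, 2.0]
    print(robustness_loss([double, squash], x))
```

In this sketch, a model whose output changes sharply under a small feature perturbation receives a large loss, so minimizing the term alongside the usual translation objectives would push the model toward outputs that stay stable under small input variations. In a real setting the layers would be blocks of a neural generator and the distance would typically be measured in a learned feature space rather than on raw outputs.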