Spatially Controllable Image Synthesis with Internal Representation Collaging
This work addresses the need for flexible image synthesis and editing tools for users in computer vision and graphics, though it appears incremental as it builds on existing GAN frameworks.
The paper tackles spatially controllable image editing by manipulating feature-space representations in trained GAN models. The resulting methods enable semantic changes over arbitrary image regions through user-specified spatial weight maps and feature blending.
We present a novel CNN-based image editing strategy that allows the user to change the semantic information of an image over an arbitrary region by manipulating the feature-space representation of the image in a trained GAN model. We present two variants of our strategy: (1) spatial conditional batch normalization (sCBN), a type of conditional batch normalization with user-specifiable spatial weight maps, and (2) feature-blending, a method of directly modifying the intermediate features. Our methods can be used to edit both artificial and real images, and both can be used together with any GAN that has conditional normalization layers. We demonstrate the power of our methods through experiments on various types of GANs trained on different datasets. Code will be available at https://github.com/pfnet-research/neural-collage.
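To make the two variants concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes sCBN mixes per-class scale and shift parameters at every pixel using weight maps that sum to one across classes, and that feature-blending is a per-pixel convex combination of two intermediate feature maps. The function names `spatial_cbn` and `blend_features`, the array shapes, and the use of batch statistics (rather than the running statistics a trained generator would normally use) are all illustrative assumptions.

```python
import numpy as np


def spatial_cbn(features, gammas, betas, weight_maps, eps=1e-5):
    """Sketch of spatial conditional batch normalization (assumed form).

    features:    (N, C, H, W) intermediate feature maps
    gammas:      (K, C) per-class scale parameters
    betas:       (K, C) per-class shift parameters
    weight_maps: (K, H, W) user-specified maps, assumed to sum to 1 over K
    """
    # Per-channel statistics over batch and spatial axes (a simplification:
    # a trained generator would typically use stored running statistics).
    mean = features.mean(axis=(0, 2, 3), keepdims=True)
    var = features.var(axis=(0, 2, 3), keepdims=True)
    normalized = (features - mean) / np.sqrt(var + eps)

    # Mix the class parameters at each pixel, giving (C, H, W) maps.
    gamma_map = np.einsum("khw,kc->chw", weight_maps, gammas)
    beta_map = np.einsum("khw,kc->chw", weight_maps, betas)

    # Broadcast the per-pixel scale and shift over the batch axis.
    return gamma_map[None] * normalized + beta_map[None]


def blend_features(target, reference, mask):
    """Sketch of feature-blending as a per-pixel convex combination.

    target, reference: (N, C, H, W) feature maps from the same layer
    mask:              (H, W) blending weights in [0, 1]
    """
    return (1.0 - mask) * target + mask * reference


# Hypothetical usage: class 0 on the left half, class 1 on the right half.
rng = np.random.default_rng(0)
feats = rng.normal(size=(2, 3, 4, 4))
maps = np.zeros((2, 4, 4))
maps[0, :, :2] = 1.0
maps[1, :, 2:] = 1.0
scales = rng.normal(size=(2, 3))
shifts = rng.normal(size=(2, 3))
edited = spatial_cbn(feats, scales, shifts, maps)
```

With hard 0/1 weight maps this reduces to ordinary conditional batch normalization applied separately inside each region; fractional weights interpolate the class parameters across a region boundary.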