CIMGEN: Controlled Image Manipulation by Finetuning Pretrained Generative Models on Limited Data
This work addresses the need for flexible image editing tools and highlights vulnerabilities in current image forensic methods, though the contribution is incremental because it builds on existing GANs.
The paper tackles controlled image manipulation by fine-tuning pre-trained GANs on limited data so that an image can be altered to match a modified semantic map. Its qualitative and quantitative results demonstrate effectiveness in image forgery and editing, including thwarting deep learning-based forensic techniques.
Content creation and image editing can benefit from flexible user controls. A common intermediate representation for conditional image generation is a semantic map, which encodes the objects present in the image. Modifying a semantic map is much easier than modifying raw RGB pixels: one can easily edit the map to selectively insert, remove, or replace objects. The method proposed in this paper takes the modified semantic map and alters the original image in accordance with it. The method leverages traditional pre-trained image-to-image translation GANs, such as CycleGAN or Pix2Pix, which are fine-tuned on a limited dataset of reference images associated with the semantic maps. We discuss the qualitative and quantitative performance of our technique to illustrate its capacity and possible applications in the fields of image forgery and image editing. We also demonstrate the effectiveness of the proposed image forgery technique in thwarting numerous deep learning-based image forensic techniques, highlighting the urgent need to develop robust and generalizable image forensic tools in the fight against the spread of fake media.
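To make the editing step concrete, the following is a minimal sketch of how objects might be removed, replaced, or inserted in a semantic map. It assumes the map is a 2D grid of integer class labels; the class ids (`SKY`, `ROAD`, `CAR`, `TREE`) and the helper names `replace_class` and `insert_box` are hypothetical and chosen only for illustration. It covers the map edit only, not the GAN that renders the edited image.

```python
# Hypothetical class ids; real datasets define their own label sets.
SKY, ROAD, CAR, TREE = 0, 1, 2, 3


def replace_class(sem_map, old, new):
    """Return a copy of the map with every `old` label relabeled as `new`."""
    return [[new if label == old else label for label in row] for row in sem_map]


def insert_box(sem_map, label, top, left, height, width):
    """Return a copy of the map with a rectangle painted with `label`.

    The rectangle is clipped to the bounds of the map.
    """
    out = [row[:] for row in sem_map]
    for r in range(max(top, 0), min(top + height, len(out))):
        for c in range(max(left, 0), min(left + width, len(out[r]))):
            out[r][c] = label
    return out


# A toy 4x4 semantic map: sky on top, a tree, a car on the road.
original = [
    [SKY, SKY, SKY, SKY],
    [SKY, TREE, SKY, SKY],
    [ROAD, ROAD, CAR, ROAD],
    [ROAD, ROAD, ROAD, ROAD],
]

# Remove: overwrite the car with the surrounding road label.
removed = replace_class(original, CAR, ROAD)

# Replace: relabel the tree region as sky.
replaced = replace_class(original, TREE, SKY)

# Insert: paint a new 1x2 car region onto the bottom row of road.
inserted = insert_box(original, CAR, top=3, left=0, height=1, width=2)
```

Each helper returns a new map and leaves the original untouched, so the original and modified maps can both be kept, for instance to compare the regions that changed.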