CVOct 18, 2023
Towards Exploring Fairness in Visual Transformer based Natural and GAN Image Detection SystemsManjary P. Gangan, Anoop Kadan, Lajish V L
Image forensics research has recently witnessed a lot of advancements towards developing computational models capable of accurately detecting natural images captured by cameras and GAN generated images. However, it is also important to ensure whether these computational models are fair enough and do not produce biased outcomes that could eventually harm certain societal groups or cause serious security threats. Exploring fairness in image forensic algorithms is an initial step towards mitigating these biases. This study explores bias in visual transformer based image forensic algorithms that classify natural and GAN images, since visual transformers are recently being widely used in image classification based tasks, including in the area of image forensics. The proposed study procures bias evaluation corpora to analyze bias in gender, racial, affective, and intersectional domains using a wide set of individual and pairwise bias evaluation measures. Since the robustness of the algorithms against image compression is an important factor to be considered in forensic tasks, this study also analyzes the impact of image compression on model bias. Hence to study the impact of image compression on model bias, a two-phase evaluation setting is followed, where the experiments are carried out in uncompressed and compressed evaluation settings. The study could identify bias existences in the visual transformer based models distinguishing natural and GAN images, and also observes that image compression impacts model biases, predominantly amplifying the presence of biases in class GAN predictions.
CVAug 14, 2023
A Robust Image Forensic Framework Utilizing Multi-Colorspace Enriched Vision Transformer for Distinguishing Natural and Computer-Generated ImagesManjary P. Gangan, Anoop Kadan, Lajish V L
The digital image forensics based research works in literature classifying natural and computer generated images primarily focuses on binary tasks. These tasks typically involve the classification of natural images versus computer graphics images only or natural images versus GAN generated images only, but not natural images versus both types of generated images simultaneously. Furthermore, despite the support of advanced convolutional neural networks and transformer based architectures that can achieve impressive classification accuracies for this forensic classification task of distinguishing natural and computer generated images, these models are seen to fail over the images that have undergone post-processing operations intended to deceive forensic algorithms, such as JPEG compression, Gaussian noise addition, etc. In this digital image forensic based work to distinguish between natural and computer-generated images encompassing both computer graphics and GAN generated images, we propose a robust forensic classifier framework leveraging enriched vision transformers. By employing a fusion approach for the networks operating in RGB and YCbCr color spaces, we achieve higher classification accuracy and robustness against the post-processing operations of JPEG compression and addition of Gaussian noise. Our approach outperforms baselines, demonstrating 94.25% test accuracy with significant performance gains in individual class accuracies. Visualizations of feature representations and attention maps reveal improved separability as well as improved information capture relevant to the forensic task. This work advances the state-of-the-art in image forensics by providing a generalized and resilient solution to distinguish between natural and generated images.
CVOct 18, 2021
Distinguishing Natural and Computer-Generated Images using Multi-Colorspace fused EfficientNetManjary P Gangan, Anoop K, Lajish V L
The problem of distinguishing natural images from photo-realistic computer-generated ones either addresses natural images versus computer graphics or natural images versus GAN images, at a time. But in a real-world image forensic scenario, it is highly essential to consider all categories of image generation, since in most cases image generation is unknown. We, for the first time, to our best knowledge, approach the problem of distinguishing natural images from photo-realistic computer-generated images as a three-class classification task classifying natural, computer graphics, and GAN images. For the task, we propose a Multi-Colorspace fused EfficientNet model by parallelly fusing three EfficientNet networks that follow transfer learning methodology where each network operates in different colorspaces, RGB, LCH, and HSV, chosen after analyzing the efficacy of various colorspace transformations in this image forensics problem. Our model outperforms the baselines in terms of accuracy, robustness towards post-processing, and generalizability towards other datasets. We conduct psychophysics experiments to understand how accurately humans can distinguish natural, computer graphics, and GAN images where we could observe that humans find difficulty in classifying these images, particularly the computer-generated images, indicating the necessity of computational algorithms for the task. We also analyze the behavior of our model through visual explanations to understand salient regions that contribute to the model's decision making and compare with manual explanations provided by human participants in the form of region markings, where we could observe similarities in both the explanations indicating the powerful nature of our model to take the decisions meaningfully.