Composite Data Augmentations for Synthetic Image Detection Against Real-World Perturbations
This work addresses the threat that AI-generated images on social media pose to online information integrity. The contribution is incremental, as it builds on existing detection methods.
The research tackles synthetic image detection on internet-sourced images that have been altered by compression and other operations. It explores composite data augmentations and achieves a mean average precision increase of 22.53% compared to models trained without augmentations.
The advent of accessible Generative AI tools enables anyone to create and spread synthetic images on social media, often with the intent to mislead, thus posing a significant threat to online information integrity. Most existing Synthetic Image Detection (SID) solutions struggle on generated images sourced from the Internet, as these are often altered by compression and other operations. To address this, our research enhances SID by exploring data augmentation combinations, leveraging a genetic algorithm for optimal augmentation selection, and introducing a dual-criteria optimization approach. These methods significantly improve model performance under real-world perturbations. Our findings provide valuable insights for developing detection models capable of identifying synthetic images across varying qualities and transformations. The best-performing model achieves a mean average precision increase of 22.53% compared to models trained without augmentations. The implementation is available at github.com/efthimia145/sid-composite-data-augmentation.
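To make the idea of genetic-algorithm-based augmentation selection concrete, the following is a minimal sketch and not the authors' implementation. Everything in it is an assumption made for illustration: the augmentation names, the two toy score tables standing in for the two optimization criteria (the abstract does not say what they are), the weighted-sum fitness, and all hyperparameters. In a real pipeline, each fitness evaluation would presumably involve training and validating a detector with the selected augmentations.

```python
import random

# Hypothetical augmentation pool; names are illustrative, not taken from the paper.
AUGMENTATIONS = ["jpeg", "blur", "resize", "noise", "crop", "color_jitter"]

# Toy stand-in scores (assumptions). A real fitness would train and validate a detector.
TOY_GAIN_A = [0.30, 0.10, 0.20, 0.05, 0.15, 0.02]  # assumed gain on criterion A
TOY_GAIN_B = [0.25, 0.20, 0.10, 0.10, 0.05, 0.03]  # assumed gain on criterion B


def fitness(mask, weight=0.5):
    """Weighted sum of two toy criteria, minus a small cost per augmentation used."""
    a = sum(g for g, bit in zip(TOY_GAIN_A, mask) if bit)
    b = sum(g for g, bit in zip(TOY_GAIN_B, mask) if bit)
    penalty = 0.08 * sum(mask)
    return weight * a + (1 - weight) * b - penalty


def crossover(p1, p2, rng):
    """Single-point crossover of two bitmask tuples."""
    cut = rng.randrange(1, len(p1))
    return p1[:cut] + p2[cut:]


def mutate(mask, rng, rate=0.1):
    """Flip each bit independently with probability `rate`."""
    return tuple(bit ^ 1 if rng.random() < rate else bit for bit in mask)


def select_augmentations(generations=30, pop_size=20, seed=0):
    """Evolve bitmasks over AUGMENTATIONS; return the best mask and its names."""
    rng = random.Random(seed)
    n = len(AUGMENTATIONS)
    population = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the fitter half unchanged (elitism), refill with mutated offspring.
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=fitness)
    return best, [name for name, bit in zip(AUGMENTATIONS, best) if bit]


if __name__ == "__main__":
    best_mask, chosen = select_augmentations()
    print(best_mask, chosen, round(fitness(best_mask), 3))
```

Encoding each candidate as a bitmask keeps crossover and mutation trivial, and the `weight` parameter is one simple way to trade off two criteria. A Pareto-based selection would be an alternative if the criteria should not be collapsed into a single score.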