LG · Jun 20, 2025
The Hidden Cost of an Image: Quantifying the Energy Consumption of AI Image Generation
Giulia Bertazzini, Chiara Albisani, Daniele Baracchi et al.
With the growing adoption of AI image generation, in conjunction with the ever-increasing environmental resources demanded by AI, we are urged to answer a fundamental question: What is the environmental impact hidden behind each image we generate? In this research, we present a comprehensive empirical experiment designed to assess the energy consumption of AI image generation. Our experiment compares 17 state-of-the-art image generation models by considering multiple factors that could affect their energy consumption, such as model quantization, image resolution, and prompt length. Additionally, we consider established image quality metrics to study potential trade-offs between energy consumption and generated image quality. Results show that image generation models vary drastically in terms of the energy they consume, with up to a 46x difference. Image resolution affects energy consumption inconsistently, ranging from a 1.3x to 4.7x increase when doubling resolution. U-Net-based models tend to consume less than Transformer-based ones. Model quantization, by contrast, degrades the energy efficiency of most models, while prompt length and content have no statistically significant impact. Improving image quality does not always come at the cost of higher energy consumption, with some of the models producing the highest quality images also being among the most energy-efficient ones.
CV · Aug 12, 2025
Bridging the Gap: A Framework for Real-World Video Deepfake Detection via Social Network Compression Emulation
Andrea Montibeller, Dasara Shullani, Daniele Baracchi et al.
The growing presence of AI-generated videos on social networks poses new challenges for deepfake detection, as detectors trained under controlled conditions often fail to generalize to real-world scenarios. A key factor behind this gap is the aggressive, proprietary compression applied by platforms like YouTube and Facebook, which launders low-level forensic cues. However, replicating these transformations at scale is difficult due to API limitations and data-sharing constraints. For these reasons, we propose a first framework that emulates the video sharing pipelines of social networks by estimating compression and resizing parameters from a small set of uploaded videos. These parameters enable a local emulator capable of reproducing platform-specific artifacts on large datasets without direct API access. Experiments on FaceForensics++ videos shared via social networks demonstrate that our emulated data closely matches the degradation patterns of real uploads. Furthermore, detectors fine-tuned on emulated videos achieve performance comparable to those trained on actual shared media. Our approach offers a scalable and practical solution for bridging the gap between lab-based training and real-world deployment of deepfake detectors, particularly in the underexplored domain of compressed video content.
MM · Jan 26, 2021
Efficient video integrity analysis through container characterization
Pengpeng Yang, Daniele Baracchi, Massimo Iuliani et al.
Most video forensic techniques look for traces within the data stream; these techniques, however, are mostly ineffective when dealing with strongly compressed or low-resolution videos. Recent research highlighted that useful forensic traces are also left in the video container structure, thus offering the opportunity to understand the life-cycle of a video file without looking at the media stream itself. In this paper we introduce a container-based method to identify the software used to perform a video manipulation and, in most cases, the operating system of the source device. As opposed to the state of the art, the proposed method is both efficient and effective and can also provide a simple explanation for its decisions. This is achieved by using a decision-tree-based classifier applied to a vectorial representation of the video container structure. We conducted an extensive validation on a dataset of 7000 video files including both software-manipulated content (ffmpeg, Exiftool, Adobe Premiere, Avidemux, and Kdenlive) and videos exchanged through social media platforms (Facebook, TikTok, Weibo and YouTube). This dataset has been made available to the research community. The proposed method achieves an accuracy of 97.6% in distinguishing pristine from tampered videos and classifying the editing software, even when the video is cut without re-encoding or when it is downscaled to the size of a thumbnail. Furthermore, it is capable of correctly identifying the operating system of the source device for most of the tampered videos.
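To illustrate what a "vectorial representation of the video container structure" might look like, here is a minimal sketch, not the authors' implementation. It assumes a container has already been parsed into a nested dictionary of atoms and fields; the names `flatten`, `build_vocabulary`, and `vectorize`, as well as the toy atom values, are hypothetical. Each atom path and each field/value pair becomes one binary feature, which a decision-tree classifier could then split on.

```python
from typing import Dict, List, Union

# Assumption: a parsed container is a nested dict of atoms and field values.
Atom = Dict[str, Union[str, int, "Atom"]]


def flatten(atom: Atom, prefix: str = "") -> List[str]:
    """Turn a nested container description into path and path=value symbols."""
    symbols: List[str] = []
    for key, value in atom.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            symbols.append(path)  # the presence of the atom is itself a feature
            symbols.extend(flatten(value, path))
        else:
            symbols.append(f"{path}={value}")
    return symbols


def build_vocabulary(containers: List[Atom]) -> List[str]:
    """Collect every symbol seen in the training containers, in sorted order."""
    return sorted({s for c in containers for s in flatten(c)})


def vectorize(container: Atom, vocab: List[str]) -> List[int]:
    """Binary vector: 1 if the symbol occurs in this container, else 0."""
    present = set(flatten(container))
    return [1 if s in present else 0 for s in vocab]


# Hypothetical toy containers, not taken from the paper's dataset.
original = {"ftyp": {"majorBrand": "mp42"}, "moov": {"mvhd": {"timescale": 600}}}
edited = {
    "ftyp": {"majorBrand": "isom"},
    "moov": {"mvhd": {"timescale": 1000}, "udta": {"encoder": "Lavf"}},
}

vocab = build_vocabulary([original, edited])
x_original = vectorize(original, vocab)
x_edited = vectorize(edited, vocab)
```

Because every feature corresponds to a readable symbol such as `/moov/udta/encoder=Lavf`, a split in a decision tree trained on these vectors can be read directly as "this field is present", which is one plausible way to obtain the simple explanations the abstract mentions.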
MM · May 4, 2017
A Hybrid Approach to Video Source Identification
Massimo Iuliani, Marco Fontani, Dasara Shullani et al.
Multimedia Forensics makes it possible to determine whether videos or images have been captured with the same device, and thus, possibly, by the same person. Currently, the most promising technology for this task exploits the unique traces left by the camera sensor in the visual content. However, image and video source identification are still treated separately from one another. This approach is limited and anachronistic if we consider that most visual media are today acquired using smartphones, which capture both images and videos. In this paper we overcome this limitation by exploring a new approach that synergistically exploits images and videos to study the device from which they both come. Indeed, we prove it is possible to identify the source of a digital video by exploiting a reference sensor pattern noise generated from still images taken by the same device as the query video. The proposed method provides comparable or even better performance than current video identification strategies, where a reference pattern is estimated from video frames. We also show how this strategy can be effective even in the case of in-camera digitally stabilized videos, where a non-stabilized reference is not available, thereby overcoming some limitations of the state of the art. We explore a possible direct application of this result, namely social media profile linking, i.e. discovering relationships between two or more social media profiles by comparing the visual contents - images or videos - shared therein.
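The core idea of matching a query against a reference sensor pattern noise can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's pipeline: noise residuals are simulated as a per-device fingerprint plus Gaussian noise rather than extracted from real images with a denoising filter, frames are flat lists of pixels, and plain normalized correlation stands in for whatever test statistic the authors use. All function names are hypothetical.

```python
import math
import random
from typing import List


def normalized_correlation(a: List[float], b: List[float]) -> float:
    """Zero-mean normalized correlation between two equal-length signals."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den else 0.0


def reference_pattern(residuals: List[List[float]]) -> List[float]:
    """Estimate the reference pattern by averaging residuals pixel by pixel."""
    n = len(residuals)
    return [sum(column) / n for column in zip(*residuals)]


def simulate_residual(fingerprint: List[float], rng: random.Random,
                      noise_std: float = 1.0) -> List[float]:
    """Assumption: residual = device fingerprint + independent Gaussian noise."""
    return [k + rng.gauss(0.0, noise_std) for k in fingerprint]


rng = random.Random(0)
pixels = 2000
device_a = [rng.gauss(0.0, 1.0) for _ in range(pixels)]  # hypothetical sensor A
device_b = [rng.gauss(0.0, 1.0) for _ in range(pixels)]  # hypothetical sensor B

# Reference built from 20 simulated still images taken with device A.
reference_a = reference_pattern(
    [simulate_residual(device_a, rng) for _ in range(20)]
)

# Query residuals standing in for video frames from each device.
score_same = normalized_correlation(reference_a, simulate_residual(device_a, rng))
score_other = normalized_correlation(reference_a, simulate_residual(device_b, rng))
```

In this simulation the score for the matching device is far higher than for the other device, which is the separation a source identification test relies on. Real videos add complications this sketch ignores, such as the resizing, cropping, and stabilization that the abstract says must be handled before a still-image reference can be compared with video frames.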