Mauro Barni

h-index59

45papers

1,806citations

Novelty47%

AI Score35

Ranked #109,304 of 194,257 authors (top 56%)#36,528 in CV (top 62%)

45 Papers

20.1CVSep 7, 2022

Supervised GAN Watermarking for Intellectual Property Protection

Jianwei Fei, Zhihua Xia, Benedetta Tondi et al.

We propose a watermarking method for protecting the Intellectual Property (IP) of Generative Adversarial Networks (GANs). The aim is to watermark the GAN model so that any image generated by the GAN contains an invisible watermark (signature), whose presence inside the image can be checked at a later stage for ownership verification. To achieve this goal, a pre-trained CNN watermarking decoding block is inserted at the output of the generator. The generator loss is then modified by including a watermark loss term, to ensure that the prescribed watermark can be extracted from the generated images. The watermark is embedded via fine-tuning, with reduced time complexity. Results show that our method can effectively embed an invisible watermark inside the generated images. Moreover, our method is a general one and can work with different GAN architectures, different tasks, and different resolutions of the output image. We also demonstrate the good robustness performance of the embedded watermark against several post-processing, among them, JPEG compression, noise addition, blurring, and color transformations.

5.7CRSep 21, 2023

Information Forensics and Security: A quarter-century-long journey

Mauro Barni, Patrizio Campisi, Edward J. Delp et al.

Information Forensics and Security (IFS) is an active R&D area whose goal is to ensure that people use devices, data, and intellectual properties for authorized purposes and to facilitate the gathering of solid evidence to hold perpetrators accountable. For over a quarter century since the 1990s, the IFS research area has grown tremendously to address the societal needs of the digital information era. The IEEE Signal Processing Society (SPS) has emerged as an important hub and leader in this area, and the article below celebrates some landmark technical contributions. In particular, we highlight the major technological advances on some selected focus areas in the field developed in the last 25 years from the research community and present future trends.

5.9CVJan 11, 2023Code

Universal Detection of Backdoor Attacks via Density-based Clustering and Centroids Analysis

Wei Guo, Benedetta Tondi, Mauro Barni

We propose a Universal Defence against backdoor attacks based on Clustering and Centroids Analysis (CCA-UD). The goal of the defence is to reveal whether a Deep Neural Network model is subject to a backdoor attack by inspecting the training dataset. CCA-UD first clusters the samples of the training set by means of density-based clustering. Then, it applies a novel strategy to detect the presence of poisoned clusters. The proposed strategy is based on a general misclassification behaviour observed when the features of a representative example of the analysed cluster are added to benign samples. The capability of inducing a misclassification error is a general characteristic of poisoned samples, hence the proposed defence is attack-agnostic. This marks a significant difference with respect to existing defences, that, either can defend against only some types of backdoor attacks, or are effective only when some conditions on the poisoning ratio or the kind of triggering signal used by the attacker are satisfied. Experiments carried out on several classification tasks and network architectures, considering different types of backdoor attacks (with either clean or corrupted labels), and triggering signals, including both global and local triggering signals, as well as sample-specific and source-specific triggers, reveal that the proposed method is very effective to defend against backdoor attacks in all the cases, always outperforming the state of the art techniques.

9.4CVJun 2, 2022

A temporal chrominance trigger for clean-label backdoor attack against anti-spoof rebroadcast detection

Wei Guo, Benedetta Tondi, Mauro Barni

We propose a stealthy clean-label video backdoor attack against Deep Learning (DL)-based models aiming at detecting a particular class of spoofing attacks, namely video rebroadcast attacks. The injected backdoor does not affect spoofing detection in normal conditions, but induces a misclassification in the presence of a specific triggering signal. The proposed backdoor relies on a temporal trigger altering the average chrominance of the video sequence. The backdoor signal is designed by taking into account the peculiarities of the Human Visual System (HVS) to reduce the visibility of the trigger, thus increasing the stealthiness of the backdoor. To force the network to look at the presence of the trigger in the challenging clean-label scenario, we choose the poisoned samples used for the injection of the backdoor following a so-called Outlier Poisoning Strategy (OPS). According to OPS, the triggering signal is inserted in the training samples that the network finds more difficult to classify. The effectiveness of the proposed backdoor attack and its generality are validated experimentally on different datasets and anti-spoofing rebroadcast detection architectures.

5.7CVSep 19, 2022

An Overview on the Generation and Detection of Synthetic and Manipulated Satellite Images

Lydia Abady, Edoardo Daniele Cannas, Paolo Bestagini et al.

Due to the reduction of technological costs and the increase of satellites launches, satellite images are becoming more popular and easier to obtain. Besides serving benevolent purposes, satellite data can also be used for malicious reasons such as misinformation. As a matter of fact, satellite images can be easily manipulated relying on general image editing tools. Moreover, with the surge of Deep Neural Networks (DNNs) that can generate realistic synthetic imagery belonging to various domains, additional threats related to the diffusion of synthetically generated satellite images are emerging. In this paper, we review the State of the Art (SOTA) on the generation and manipulation of satellite images. In particular, we focus on both the generation of synthetic satellite imagery from scratch, and the semantic manipulation of satellite images by means of image-transfer technologies, including the transformation of images obtained from one type of sensor to another one. We also describe forensic detection techniques that have been researched so far to classify and detect synthetic image forgeries. While we focus mostly on forensic techniques explicitly tailored to the detection of AI-generated synthetic contents, we also review some methods designed for general splicing detection, which can in principle also be used to spot AI manipulate images

1.2SYJul 14, 2014

Source Distinguishability under Distortion-Limited Attack: an Optimal Transport Perspective

Mauro Barni, Benedetta Tondi

We analyze the distinguishability of two sources in a Neyman-Pearson set-up when an attacker is allowed to modify the output of one of the two sources subject to a distortion constraint. By casting the problem in a game-theoretic framework and by exploiting the parallelism between the attacker's goal and Optimal Transport Theory, we introduce the concept of Security Margin defined as the maximum average per-sample distortion introduced by the attacker for which the two sources can be distinguished ensuring arbitrarily small, yet positive, error exponents for type I and type II error probabilities. Several versions of the problem are considered according to the available knowledge about the sources and the type of distance used to define the distortion constraint. We compute the security margin for some classes of sources and derive a general upper bound assuming that the distortion is measured in terms of the mean square error between the original and the attacked sequence.

7.6CVApr 11, 2023

Open Set Classification of GAN-based Image Manipulations via a ViT-based Hybrid Architecture

Jun Wang, Omran Alamayreh, Benedetta Tondi et al.

Classification of AI-manipulated content is receiving great attention, for distinguishing different types of manipulations. Most of the methods developed so far fail in the open-set scenario, that is when the algorithm used for the manipulation is not represented by the training set. In this paper, we focus on the classification of synthetic face generation and manipulation in open-set scenarios, and propose a method for classification with a rejection option. The proposed method combines the use of Vision Transformers (ViT) with a hybrid approach for simultaneous classification and localization. Feature map correlation is exploited by the ViT module, while a localization branch is employed as an attention mechanism to force the model to learn per-class discriminative features associated with the forgery when the manipulation is performed locally in the image. Rejection is performed by considering several strategies and analyzing the model output layers. The effectiveness of the proposed method is assessed for the task of classification of facial attribute editing and GAN attribution.

5.7CVAug 23, 2022Code

Robust and Large-Payload DNN Watermarking via Fixed, Distribution-Optimized, Weights

Benedetta Tondi, Andrea Costanzo, Mauro Barni

The design of an effective multi-bit watermarking algorithm hinges upon finding a good trade-off between the three fundamental requirements forming the watermarking trade-off triangle, namely, robustness against network modifications, payload, and unobtrusiveness, ensuring minimal impact on the performance of the watermarked network. In this paper, we first revisit the nature of the watermarking trade-off triangle for the DNN case, then we exploit our findings to propose a white-box, multi-bit watermarking method achieving very large payload and strong robustness against network modification. In the proposed system, the weights hosting the watermark are set prior to training, making sure that their amplitude is large enough to bear the target payload and survive network modifications, notably retraining, and are left unchanged throughout the training process. The distribution of the weights carrying the watermark is theoretically optimised to ensure the secrecy of the watermark and make sure that the watermarked weights are indistinguishable from the non-watermarked ones. The proposed method can achieve outstanding performance, with no significant impact on network accuracy, including robustness against network modifications, retraining and transfer learning, while ensuring a payload which is out of reach of state of the art methods achieving a lower - or at most comparable - robustness.

1.2SYFeb 27, 2017

A Message Passing Approach for Decision Fusion in Adversarial Multi-Sensor Networks

Andrea Abrardo, Mauro Barni, Kassem Kallas et al.

We consider a simple, yet widely studied, set-up in which a Fusion Center (FC) is asked to make a binary decision about a sequence of system states by relying on the possibly corrupted decisions provided by byzantine nodes, i.e. nodes which deliberately alter the result of the local decision to induce an error at the fusion center. When independent states are considered, the optimum fusion rule over a batch of observations has already been derived, however its complexity prevents its use in conjunction with large observation windows. In this paper, we propose a near-optimal algorithm based on message passing that greatly reduces the computational burden of the optimum fusion rule. In addition, the proposed algorithm retains very good performance also in the case of dependent system states. By first focusing on the case of small observation windows, we use numerical simulations to show that the proposed scheme introduces a negligible increase of the decision error probability compared to the optimum fusion rule. We then analyse the performance of the new scheme when the FC make its decision by relying on long observation windows. We do so by considering both the case of independent and Markovian system states and show that the obtained performance are superior to those obtained with prior suboptimal schemes. As an additional result, we confirm the previous finding that, in some cases, it is preferable for the byzantine nodes to minimise the mutual information between the sequence system states and the reports submitted to the FC, rather than always flipping the local decision.

3.7CVSep 2, 2022Code

Which country is this picture from? New data and methods for DNN-based country recognition

Omran Alamayreh, Giovanna Maria Dimitri, Jun Wang et al.

Recognizing the country where a picture has been taken has many potential applications, such as identification of fake news and prevention of disinformation campaigns. Previous works focused on the estimation of the geo-coordinates where a picture has been taken. Yet, recognizing in which country an image was taken could be more critical, from a semantic and forensic point of view, than estimating its spatial coordinates. In the above framework, this paper provides two contributions. First, we introduce the VIPPGeo dataset, containing 3.8 million geo-tagged images. Secondly, we used the dataset to train a model casting the country recognition problem as a classification problem. The experiments show that our model provides better results than the current state of the art. Notably, we found that asking the network to identify the country provides better results than estimating the geo-coordinates and then tracing them back to the country where the picture was taken.

2.6CVMay 14, 2022

An Architecture for the detection of GAN-generated Flood Images with Localization Capabilities

Jun Wang, Omran Alamayreh, Benedetta Tondi et al.

In this paper, we address a new image forensics task, namely the detection of fake flood images generated by ClimateGAN architecture. We do so by proposing a hybrid deep learning architecture including both a detection and a localization branch, the latter being devoted to the identification of the image regions manipulated by ClimateGAN. Even if our goal is the detection of fake flood images, in fact, we found that adding a localization branch helps the network to focus on the most relevant image regions with significant improvements in terms of generalization capabilities and robustness against image processing operations. The good performance of the proposed architecture is validated on two datasets of pristine flood images downloaded from the internet and three datasets of fake flood images generated by ClimateGAN starting from a large set of diverse street images.

8.4CVOct 25, 2023

Wide Flat Minimum Watermarking for Robust Ownership Verification of GANs

Jianwei Fei, Zhihua Xia, Benedetta Tondi et al.

We propose a novel multi-bit box-free watermarking method for the protection of Intellectual Property Rights (IPR) of GANs with improved robustness against white-box attacks like fine-tuning, pruning, quantization, and surrogate model attacks. The watermark is embedded by adding an extra watermarking loss term during GAN training, ensuring that the images generated by the GAN contain an invisible watermark that can be retrieved by a pre-trained watermark decoder. In order to improve the robustness against white-box model-level attacks, we make sure that the model converges to a wide flat minimum of the watermarking loss term, in such a way that any modification of the model parameters does not erase the watermark. To do so, we add random noise vectors to the parameters of the generator and require that the watermarking loss term is as invariant as possible with respect to the presence of noise. This procedure forces the generator to converge to a wide flat minimum of the watermarking loss. The proposed method is architectureand dataset-agnostic, thus being applicable to many different generation tasks and models, as well as to CNN-based image processing architectures. We present the results of extensive experiments showing that the presence of the watermark has a negligible impact on the quality of the generated images, and proving the superior robustness of the watermark against model modification and surrogate model attacks.

1.4CVMay 16, 2022

Fusing Multiscale Texture and Residual Descriptors for Multilevel 2D Barcode Rebroadcasting Detection

Anselmo Ferreira, Changcheng Chen, Mauro Barni

Nowadays, 2D barcodes have been widely used for advertisement, mobile payment, and product authentication. However, in applications related to product authentication, an authentic 2D barcode can be illegally copied and attached to a counterfeited product in such a way to bypass the authentication scheme. In this paper, we employ a proprietary 2D barcode pattern and use multimedia forensics methods to analyse the scanning and printing artefacts resulting from the copy (rebroadcasting) attack. A diverse and complementary feature set is proposed to quantify the barcode texture distortions introduced during the illegal copying process. The proposed features are composed of global and local descriptors, which characterize the multi-scale texture appearance and the points of interest distribution, respectively. The proposed descriptors are compared against some existing texture descriptors and deep learning-based approaches under various scenarios, such as cross-datasets and cross-size. Experimental results highlight the practicality of the proposed method in real-world settings.

7.6CVNov 9, 2023

Robust Retraining-free GAN Fingerprinting via Personalized Normalization

Jianwei Fei, Zhihua Xia, Benedetta Tondi et al.

In recent years, there has been significant growth in the commercial applications of generative models, licensed and distributed by model developers to users, who in turn use them to offer services. In this scenario, there is a need to track and identify the responsible user in the presence of a violation of the license agreement or any kind of malicious usage. Although there are methods enabling Generative Adversarial Networks (GANs) to include invisible watermarks in the images they produce, generating a model with a different watermark, referred to as a fingerprint, for each user is time- and resource-consuming due to the need to retrain the model to include the desired fingerprint. In this paper, we propose a retraining-free GAN fingerprinting method that allows model developers to easily generate model copies with the same functionality but different fingerprints. The generator is modified by inserting additional Personalized Normalization (PN) layers whose parameters (scaling and bias) are generated by two dedicated shallow networks (ParamGen Nets) taking the fingerprint as input. A watermark decoder is trained simultaneously to extract the fingerprint from the generated images. The proposed method can embed different fingerprints inside the GAN by just changing the input of the ParamGen Nets and performing a feedforward pass, without finetuning or retraining. The performance of the proposed method in terms of robustness against both model-level and image-level attacks is also superior to the state-of-the-art.

7.6CVJul 19, 2023Code

A Siamese-based Verification System for Open-set Architecture Attribution of Synthetic Images

Lydia Abady, Jun Wang, Benedetta Tondi et al.

Despite the wide variety of methods developed for synthetic image attribution, most of them can only attribute images generated by models or architectures included in the training set and do not work with unknown architectures, hindering their applicability in real-world scenarios. In this paper, we propose a verification framework that relies on a Siamese Network to address the problem of open-set attribution of synthetic images to the architecture that generated them. We consider two different settings. In the first setting, the system determines whether two images have been produced by the same generative architecture or not. In the second setting, the system verifies a claim about the architecture used to generate a synthetic image, utilizing one or multiple reference images generated by the claimed architecture. The main strength of the proposed system is its ability to operate in both closed and open-set scenarios so that the input images, either the query and reference images, can belong to the architectures considered during training or not. Experimental evaluations encompassing various generative architectures such as GANs, diffusion models, and transformers, focusing on synthetic face image generation, confirm the excellent performance of our method in both closed and open-set settings, as well as its strong generalization capabilities.

5.9MMApr 28, 2025Code

WILD: a new in-the-Wild Image Linkage Dataset for synthetic image attribution

Pietro Bongini, Sara Mandelli, Andrea Montibeller et al.

Synthetic image source attribution is an open challenge, with an increasing number of image generators being released yearly. The complexity and the sheer number of available generative techniques, as well as the scarcity of high-quality open source datasets of diverse nature for this task, make training and benchmarking synthetic image source attribution models very challenging. WILD is a new in-the-Wild Image Linkage Dataset designed to provide a powerful training and benchmarking tool for synthetic image attribution models. The dataset is built out of a closed set of 10 popular commercial generators, which constitutes the training base of attribution models, and an open set of 10 additional generators, simulating a real-world in-the-wild scenario. Each generator is represented by 1,000 images, for a total of 10,000 images in the closed set and 10,000 images in the open set. Half of the images are post-processed with a wide range of operators. WILD allows benchmarking attribution models in a wide range of tasks, including closed and open set identification and verification, and robust attribution with respect to post-processing and adversarial attacks. Models trained on WILD are expected to benefit from the challenging scenario represented by the dataset itself. Moreover, an assessment of seven baseline methodologies on closed and open set attribution is presented, including robustness tests with respect to post-processing.

2.0CVOct 26, 2024Code

An Efficient Watermarking Method for Latent Diffusion Models via Low-Rank Adaptation and Dynamic Loss Weighting

Dongdong Lin, Yue Li, Benedetta Tondi et al.

The rapid proliferation of Deep Neural Networks (DNNs) is driving a surge in model watermarking technologies, as the trained models themselves constitute valuable intellectual property. Existing watermarking approaches primarily focus on modifying model parameters or altering sampling behaviors. However, with the emergence of increasingly large models, improving the efficiency of watermark embedding becomes essential to manage increasing computational demands. Prioritizing efficiency not only optimizes resource utilization, making the watermarking process more applicable for large models, but also mitigates potential degradation of model performance. In this paper, we propose an efficient watermarking method for Latent Diffusion Models (LDMs) based on Low-Rank Adaptation (LoRA). The core idea is to introduce trainable low-rank parameters into the frozen LDM to embed watermark, thereby preserving the integrity of the original model weights. Furthermore, a dynamic loss weight scheduler is designed to adaptively balance the objectives of generative quality and watermark fidelity, enabling the model to achieve effective watermark embedding with minimal impact on quality of the generated images. Experimental results show that the proposed method ensures fast and accurate watermark embedding and a high quality of the generated images, at the same time maintaining a level of robustness aligned - in some cases superior - with state-of-the-art approaches. Moreover, the method generalizes well across different datasets and base LDMs. Codes are available at: https://github.com/MrDongdongLin/EW-LoRA.

7.6CVMay 19, 2024

BOSC: A Backdoor-based Framework for Open Set Synthetic Image Attribution

Jun Wang, Benedetta Tondi, Mauro Barni

Synthetic image attribution addresses the problem of tracing back the origin of images produced by generative models. Extensive efforts have been made to explore unique representations of generative models and use them to attribute a synthetic image to the model that produced it. Most of the methods classify the models or the architectures among those in a closed set without considering the possibility that the system is fed with samples produced by unknown architectures. With the continuous progress of AI technology, new generative architectures continuously appear, thus driving the attention of researchers towards the development of tools capable of working in open-set scenarios. In this paper, we propose a framework for open set attribution of synthetic images, named BOSC (Backdoor-based Open Set Classification), that relies on the concept of backdoor attacks to design a classifier with rejection option. BOSC works by purposely injecting class-specific triggers inside a portion of the images in the training set to induce the network to establish a matching between class features and trigger features. The behavior of the trained model with respect to triggered samples is then exploited at test time to perform sample rejection using an ad-hoc score. Experiments show that the proposed method has good performance, always surpassing the state-of-the-art. Robustness against image processing is also very good. Although we designed our method for the task of synthetic image attribution, the proposed framework is a general one and can be used for other image forensic applications.

2.8CVMay 19, 2023

A One-Class Classifier for the Detection of GAN Manipulated Multi-Spectral Satellite Images

Lydia Abady, Giovanna Maria Dimitri, Mauro Barni

The highly realistic image quality achieved by current image generative models has many academic and industrial applications. To limit the use of such models to benign applications, though, it is necessary that tools to conclusively detect whether an image has been generated synthetically or not are developed. For this reason, several detectors have been developed providing excellent performance in computer vision applications, however, they can not be applied as they are to multispectral satellite images, and hence new models must be trained. In general, two-class classifiers can achieve very good detection accuracies, however they are not able to generalise to image domains and generative models architectures different than those used during training. For this reason, in this paper, we propose a one-class classifier based on Vector Quantized Variational Autoencoder 2 (VQ-VAE 2) features to overcome the limitations of two-class classifiers. First, we emphasize the generalization problem that binary classifiers suffer from by training and testing an EfficientNet-B4 architecture on multiple multispectral datasets. Then we show that, since the VQ-VAE 2 based classifier is trained only on pristine images, it is able to detect images belonging to different domains and generated by architectures that have not been used during training. Last, we compare the two classifiers head-to-head on the same generated datasets, highlighting the superiori generalization capabilities of the VQ-VAE 2-based detector.

29.2CRNov 16, 2021

An Overview of Backdoor Attacks Against Deep Neural Networks and Possible Defences

Wei Guo, Benedetta Tondi, Mauro Barni

Together with impressive advances touching every aspect of our society, AI technology based on Deep Neural Networks (DNN) is bringing increasing security concerns. While attacks operating at test time have monopolised the initial attention of researchers, backdoor attacks, exploiting the possibility of corrupting DNN models by interfering with the training process, represents a further serious threat undermining the dependability of AI techniques. In a backdoor attack, the attacker corrupts the training data so to induce an erroneous behaviour at test time. Test time errors, however, are activated only in the presence of a triggering event corresponding to a properly crafted input sample. In this way, the corrupted network continues to work as expected for regular inputs, and the malicious behaviour occurs only when the attacker decides to activate the backdoor hidden within the network. In the last few years, backdoor attacks have been the subject of an intense research activity focusing on both the development of new classes of attacks, and the proposal of possible countermeasures. The goal of this overview paper is to review the works published until now, classifying the different types of attacks and defences proposed so far. The classification guiding the analysis is based on the amount of control that the attacker has on the training process, and the capability of the defender to verify the integrity of the data used for training, and to monitor the operations of the DNN at training and test time. As such, the proposed analysis is particularly suited to highlight the strengths and weaknesses of both attacks and defences with reference to the application scenarios they are operating in.

3.7CVSep 10, 2021

Detection of GAN-synthesized street videos

Omran Alamayreh, Mauro Barni

Research on the detection of AI-generated videos has focused almost exclusively on face videos, usually referred to as deepfakes. Manipulations like face swapping, face reenactment and expression manipulation have been the subject of an intense research with the development of a number of efficient tools to distinguish artificial videos from genuine ones. Much less attention has been paid to the detection of artificial non-facial videos. Yet, new tools for the generation of such kind of videos are being developed at a fast pace and will soon reach the quality level of deepfake videos. The goal of this paper is to investigate the detectability of a new kind of AI-generated videos framing driving street sequences (here referred to as DeepStreets videos), which, by their nature, can not be analysed with the same tools used for facial deepfakes. Specifically, we present a simple frame-based detector, achieving very good performance on state-of-the-art DeepStreets videos generated by the Vid2vid architecture. Noticeably, the detector retains very good performance on compressed videos, even when the compression level used during training does not match that used for the test videos.

3.8CRMay 9, 2021

Improving Cost Learning for JPEG Steganography by Exploiting JPEG Domain Knowledge

Weixuan Tang, Bin Li, Mauro Barni et al.

Although significant progress in automatic learning of steganographic cost has been achieved recently, existing methods designed for spatial images are not well applicable to JPEG images which are more common media in daily life. The difficulties of migration mostly lie in the unique and complicated JPEG characteristics caused by 8x8 DCT mode structure. To address the issue, in this paper we extend an existing automatic cost learning scheme to JPEG, where the proposed scheme called JEC-RL (JPEG Embedding Cost with Reinforcement Learning) is explicitly designed to tailor the JPEG DCT structure. It works with the embedding action sampling mechanism under reinforcement learning, where a policy network learns the optimal embedding policies via maximizing the rewards provided by an environment network. The policy network is constructed following a domain-transition design paradigm, where three modules including pixel-level texture complexity evaluation, DCT feature extraction, and mode-wise rearrangement, are proposed. These modules operate in serial, gradually extracting useful features from a decompressed JPEG image and converting them into embedding policies for DCT elements, while considering JPEG characteristics including inter-block and intra-block correlations simultaneously. The environment network is designed in a gradient-oriented way to provide stable reward values by using a wide architecture equipped with a fixed preprocessing layer with 8x8 DCT basis filters. Extensive experiments and ablation studies demonstrate that the proposed method can achieve good security performance for JPEG images against both advanced feature based and modern CNN based steganalyzers.

7.3CVMay 1, 2021

A Master Key Backdoor for Universal Impersonation Attack against DNN-based Face Verification

Wei Guo, Benedetta Tondi, Mauro Barni

We introduce a new attack against face verification systems based on Deep Neural Networks (DNN). The attack relies on the introduction into the network of a hidden backdoor, whose activation at test time induces a verification error allowing the attacker to impersonate any user. The new attack, named Master Key backdoor attack, operates by interfering with the training phase, so to instruct the DNN to always output a positive verification answer when the face of the attacker is presented at its input. With respect to existing attacks, the new backdoor attack offers much more flexibility, since the attacker does not need to know the identity of the victim beforehand. In this way, he can deploy a Universal Impersonation attack in an open-set framework, allowing him to impersonate any enrolled users, even those that were not yet enrolled in the system when the attack was conceived. We present a practical implementation of the attack targeting a Siamese-DNN face verification system, and show its effectiveness when the system is trained on VGGFace2 dataset and tested on LFW and YTF datasets. According to our experiments, the Master Key backdoor attack provides a high attack success rate even when the ratio of poisoned training data is as small as 0.01, thus raising a new alarm regarding the use of DNN-based face verification systems in security-critical applications.

35.7CRMar 16, 2021

A survey of deep neural network watermarking techniques

Yue Li, Hongxia Wang, Mauro Barni

Protecting the Intellectual Property Rights (IPR) associated to Deep Neural Networks (DNNs) is a pressing need pushed by the high costs required to train such networks and the importance that DNNs are gaining in our society. Following its use for Multimedia (MM) IPR protection, digital watermarking has recently been considered as a mean to protect the IPR of DNNs. While DNN watermarking inherits some basic concepts and methods from MM watermarking, there are significant differences between the two application areas, calling for the adaptation of media watermarking techniques to the DNN scenario and the development of completely new methods. In this paper, we overview the most recent advances in DNN watermarking, by paying attention to cast it into the bulk of watermarking theory developed during the last two decades, while at the same time highlighting the new challenges and opportunities characterizing DNN watermarking. Rather than trying to present a comprehensive description of all the methods proposed so far, we introduce a new taxonomy of DNN watermarking and present a few exemplary methods belonging to each class. We hope that this paper will inspire new research in this exciting area and will help researchers to focus on the most innovative and challenging problems in the field.

6.1IVFeb 2, 2021

Image Splicing Detection, Localization and Attribution via JPEG Primary Quantization Matrix Estimation and Clustering

Yakun Niu, Benedetta Tondi, Yao Zhao et al.

Detection of inconsistencies of double JPEG artefacts across different image regions is often used to detect local image manipulations, like image splicing, and to localize them. In this paper, we move one step further, proposing an end-to-end system that, in addition to detecting and localizing spliced regions, can also distinguish regions coming from different donor images. We assume that both the spliced regions and the background image have undergone a double JPEG compression, and use a local estimate of the primary quantization matrix to distinguish between spliced regions taken from different sources. To do so, we cluster the image blocks according to the estimated primary quantization matrix and refine the result by means of morphological reconstruction. The proposed method can work in a wide variety of settings including aligned and non-aligned double JPEG compression, and regardless of whether the second compression is stronger or weaker than the first one. We validated the proposed approach by means of extensive experiments showing its superior performance with respect to baseline methods working in similar conditions.

1.4CVFeb 1, 2021

VIPPrint: A Large Scale Dataset of Printed and Scanned Images for Synthetic Face Images Detection and Source Linking

Anselmo Ferreira, Ehsan Nowroozi, Mauro Barni

The possibility of carrying out a meaningful forensics analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography pictures, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods to distinguish natural and synthetic face images fail when applied to print and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.

24.2CRDec 28, 2020

Spread-Transform Dither Modulation Watermarking of Deep Neural Network

Yue Li, Benedetta Tondi, Mauro Barni

DNN watermarking is receiving an increasing attention as a suitable mean to protect the Intellectual Property Rights associated to DNN models. Several methods proposed so far are inspired to the popular Spread Spectrum (SS) paradigm according to which the watermark bits are embedded into the projection of the weights of the DNN model onto a pseudorandom sequence. In this paper, we propose a new DNN watermarking algorithm that leverages on the watermarking with side information paradigm to decrease the obtrusiveness of the watermark and increase its payload. In particular, the new scheme exploits the main ideas of ST-DM (Spread Transform Dither Modulation) watermarking to improve the performance of a recently proposed algorithm based on conventional SS. The experiments we carried out by applying the proposed scheme to watermark different models, demonstrate its capability to provide a higher payload with a lower impact on network accuracy than a baseline method based on conventional SS, while retaining a satisfactory level of robustness.

18.6CVJul 25, 2020

CNN Detection of GAN-Generated Face Images based on Cross-Band Co-occurrences Analysis

Mauro Barni, Kassem Kallas, Ehsan Nowroozi et al.

Last-generation GAN models allow to generate synthetic images which are visually indistinguishable from natural ones, raising the need to develop tools to distinguish fake and natural images thus contributing to preserve the trustworthiness of digital images. While modern GAN models can generate very high-quality images with no visible spatial artifacts, reconstruction of consistent relationships among colour channels is expectedly more difficult. In this paper, we propose a method for distinguishing GAN-generated from natural images by exploiting inconsistencies among spectral bands, with specific focus on the generation of synthetic face images. Specifically, we use cross-band co-occurrence matrices, in addition to spatial co-occurrence matrices, as input to a CNN model, which is trained to distinguish between real and synthetic faces. The results of our experiments confirm the goodness of our approach which outperforms a similar detection technique based on intra-band spatial co-occurrences only. The performance gain is particularly significant with regard to robustness against post-processing, like geometric transformations, filtering and contrast manipulations.

1.2CVMay 12, 2020

Increased-confidence adversarial examples for deep learning counter-forensics

Wenjie Li, Benedetta Tondi, Rongrong Ni et al.

Transferability of adversarial examples is a key issue to apply this kind of attacks against multimedia forensics (MMF) techniques based on Deep Learning (DL) in a real-life setting. Adversarial example transferability, in fact, would open the way to the deployment of successful counter forensics attacks also in cases where the attacker does not have a full knowledge of the to-be-attacked system. Some preliminary works have shown that adversarial examples against CNN-based image forensics detectors are in general non-transferrable, at least when the basic versions of the attacks implemented in the most popular libraries are adopted. In this paper, we introduce a general strategy to increase the strength of the attacks and evaluate their transferability when such a strength varies. We experimentally show that, in this way, attack transferability can be largely increased, at the expense of a larger distortion. Our research confirms the security threats posed by the existence of adversarial examples even in multimedia forensics scenarios, thus calling for new defense strategies to improve the security of DL-based MMF techniques.

8.8CRMar 26, 2020

Challenging the adversarial robustness of DNNs based on error-correcting output codes

Bowen Zhang, Benedetta Tondi, Xixiang Lv et al.

The existence of adversarial examples and the easiness with which they can be generated raise several security concerns with regard to deep learning systems, pushing researchers to develop suitable defense mechanisms. The use of networks adopting error-correcting output codes (ECOC) has recently been proposed to counter the creation of adversarial examples in a white-box setting. In this paper, we carry out an in-depth investigation of the adversarial robustness achieved by the ECOC approach. We do so by proposing a new adversarial attack specifically designed for multi-label classification architectures, like the ECOC-based one, and by applying two existing attacks. In contrast to previous findings, our analysis reveals that ECOC-based networks can be attacked quite easily by introducing a small adversarial perturbation. Moreover, the adversarial examples can be generated in such a way to achieve high probabilities for the predicted target class, hence making it difficult to use the prediction confidence to detect them. Our findings are proven by means of experimental results obtained on MNIST, CIFAR-10 and GTSRB classification tasks.

7.1CVDec 29, 2019Code

Copy Move Source-Target Disambiguation through Multi-Branch CNNs

Mauro Barni, Quoc-Tin Phan, Benedetta Tondi

We propose a method to identify the source and target regions of a copy-move forgery so allow a correct localisation of the tampered area. First, we cast the problem into a hypothesis testing framework whose goal is to decide which region between the two nearly-duplicate regions detected by a generic copy-move detector is the original one. Then we design a multi-branch CNN architecture that solves the hypothesis testing problem by learning a set of features capable to reveal the presence of interpolation artefacts and boundary inconsistencies in the copy-moved area. The proposed architecture, trained on a synthetic dataset explicitly built for this purpose, achieves good results on copy-move forgeries from both synthetic and realistic datasets. Based on our tests, the proposed disambiguation method can reliably reveal the target region even in realistic cases where an approximate version of the copy-move localization mask is provided by a state-of-the-art copy-move detection algorithm.

9.7CROct 25, 2019

Effectiveness of random deep feature selection for securing image manipulation detectors against adversarial examples

Mauro Barni, Ehsan Nowroozi, Benedetta Tondi et al.

We investigate if the random feature selection approach proposed in [1] to improve the robustness of forensic detectors to targeted attacks, can be extended to detectors based on deep learning features. In particular, we study the transferability of adversarial examples targeting an original CNN image manipulation detector to other detectors (a fully connected neural network and a linear SVM) that rely on a random subset of the features extracted from the flatten layer of the original network. The results we got by considering three image manipulation detection tasks (resizing, median filtering and adaptive histogram equalization), two original network architectures and three classes of attacks, show that feature randomization helps to hinder attack transferability, even if, in some cases, simply changing the architecture of the detector, or even retraining the detector is enough to prevent the transferability of the attacks.

8.3CROct 1, 2019

Attacking CNN-based anti-spoofing face authentication in the physical domain

Bowen Zhang, Benedetta Tondi, Mauro Barni

In this paper, we study the vulnerability of anti-spoofing methods based on deep learning against adversarial perturbations. We first show that attacking a CNN-based anti-spoofing face authentication system turns out to be a difficult task. When a spoofed face image is attacked in the physical world, in fact, the attack has not only to remove the rebroadcast artefacts present in the image, but it has also to take into account that the attacked image will be recaptured again and then compensate for the distortions that will be re-introduced after the attack by the subsequent rebroadcast process. Subsequently, we propose a method to craft robust physical domain adversarial images against anti-spoofing CNN-based face authentication. The attack built in this way can successfully pass all the steps in the authentication chain (that is, face detection, face recognition and spoofing detection), by achieving simultaneously the following goals: i) make the spoofing detection fail; ii) let the facial region be detected as a face and iii) recognized as belonging to the victim of the attack. The effectiveness of the proposed attack is validated experimentally within a realistic setting, by considering the REPLAY-MOBILE database, and by feeding the adversarial images to a real face authentication system capturing the input images through a mobile phone camera.

3.3MMJun 3, 2019

CNN-based Steganalysis and Parametric Adversarial Embedding: a Game-Theoretic Framework

Xiaoyu Shi, Benedetta Tondi, Bin Li et al.

CNN-based steganalysis has recently achieved very good performance in detecting content-adaptive steganography. At the same time, recent works have shown that, by adopting an approach similar to that used to build adversarial examples, a steganographer can adopt an adversarial embedding strategy to effectively counter a target CNN steganalyzer. In turn, the good performance of the steganalyzer can be restored by retraining the CNN with adversarial stego images. A problem with this model is that, arguably, at training time the steganalizer is not aware of the exact parameters used by the steganograher for adversarial embedding and, vice versa, the steganographer does not know how the images that will be used to train the steganalyzer are generated. In order to exit this apparent deadlock, we introduce a game theoretic framework wherein the problem of setting the parameters of the steganalyzer and the steganographer is solved in a strategic way. More specifically, a non-zero sum game is first formulated to model the problem, and then instantiated by considering a specific adversarial embedding scheme setting its operating parameters in a game-theoretic fashion. Our analysis shows that the equilibrium solution of the non zero-sum game can be conveniently found by solving an associated zero-sum game, thus reducing greatly the complexity of the problem. Then we run several experiments to derive the optimum strategies for the steganographer and the staganalyst in a game-theoretic sense, and to evaluate the performance of the game at the equilibrium, characterizing the loss with respect to the conventional non-adversarial case. Eventually, by leveraging on the analysis of the equilibrium point of the game, we introduce a new strategy to improve the reliability of the steganalysis, which shows the benefits of addressing the security issue in a game-theoretic perspective.

12.0CRFeb 22, 2019

Improving the security of Image Manipulation Detection through One-and-a-half-class Multiple Classification

Mauro Barni, Ehsan Nowroozi, Benedetta Tondi

Protecting image manipulation detectors against perfect knowledge attacks requires the adoption of detector architectures which are intrinsically difficult to attack. In this paper, we do so, by exploiting a recently proposed multiple-classifier architecture combining the improved security of 1-Class (1C) classification and the good performance ensured by conventional 2-Class (2C) classification in the absence of attacks. The architecture, also known as 1.5-Class (1.5C) classifier, consists of one 2C classifier and two 1C classifiers run in parallel followed by a final 1C classifier. In our system, the first three classifiers are implemented by means of Support Vector Machines (SVM) fed with SPAM features. The outputs of such classifiers are then processed by a final 1C SVM in charge of making the final decision. Particular care is taken to design a proper strategy to train the SVMs the 1.5C classifier relies on. This is a crucial task, due to the difficulty of training the two 1C classifiers at the front end of the system. We assessed the performance of the proposed solution with regard to three manipulation detection tasks, namely image resizing, contrast enhancement and median filtering. As a result, the security improvement allowed by the 1.5C architecture with respect to a conventional 2C solution is confirmed, with a performance loss in the absence of attacks that remains at a negligible level.

44.3CRFeb 12, 2019

A new Backdoor Attack in CNNs by training set corruption without label poisoning

Mauro Barni, Kassem Kallas, Benedetta Tondi

Backdoor attacks against CNNs represent a new threat against deep learning systems, due to the possibility of corrupting the training set so to induce an incorrect behaviour at test time. To avoid that the trainer recognises the presence of the corrupted samples, the corruption of the training set must be as stealthy as possible. Previous works have focused on the stealthiness of the perturbation injected into the training samples, however they all assume that the labels of the corrupted samples are also poisoned. This greatly reduces the stealthiness of the attack, since samples whose content does not agree with the label can be identified by visual inspection of the training set or by running a pre-classification step. In this paper we present a new backdoor attack without label poisoning Since the attack works by corrupting only samples of the target class, it has the additional advantage that it does not need to identify beforehand the class of the samples to be attacked at test time. Results obtained on the MNIST digits recognition task and the traffic signs classification task show that backdoor attacks without label poisoning are indeed possible, thus raising a new alarm regarding the use of deep learning in security-critical applications.

20.4CRNov 5, 2018

On the Transferability of Adversarial Examples Against CNN-Based Image Forensics

Mauro Barni, Kassem Kallas, Ehsan Nowroozi et al.

Recent studies have shown that Convolutional Neural Networks (CNN) are relatively easy to attack through the generation of so-called adversarial examples. Such vulnerability also affects CNN-based image forensic tools. Research in deep learning has shown that adversarial examples exhibit a certain degree of transferability, i.e., they maintain part of their effectiveness even against CNN models other than the one targeted by the attack. This is a very strong property undermining the usability of CNN's in security-oriented applications. In this paper, we investigate if attack transferability also holds in image forensics applications. With specific reference to the case of manipulation detection, we analyse the results of several experiments considering different sources of mismatch between the CNN used to build the adversarial examples and the one adopted by the forensic analyst. The analysis ranges from cases in which the mismatch involves only the training dataset, to cases in which the attacker and the forensic analyst adopt different architectures. The results of our experiments show that, in the majority of the cases, the attacks are not transferable, thus easing the design of proper countermeasures at least when the attacker does not have a perfect knowledge of the target detector.

8.5CRMay 29, 2018

CNN-Based Detection of Generic Constrast Adjustment with JPEG Post-processing

Mauro Barni, Andrea Costanzo, Ehsan Nowroozi et al.

Detection of contrast adjustments in the presence of JPEG postprocessing is known to be a challenging task. JPEG post processing is often applied innocently, as JPEG is the most common image format, or it may correspond to a laundering attack, when it is purposely applied to erase the traces of manipulation. In this paper, we propose a CNN-based detector for generic contrast adjustment, which is robust to JPEG compression. The proposed system relies on a patch-based Convolutional Neural Network (CNN), trained to distinguish pristine images from contrast adjusted images, for some selected adjustment operators of different nature. Robustness to JPEG compression is achieved by training the CNN with JPEG examples, compressed over a range of Quality Factors (QFs). Experimental results show that the detector works very well and scales well with respect to the adjustment type, yielding very good performance under a large variety of unseen tonal adjustments.

2.3CRMar 28, 2018

SEMBA:SEcure multi-biometric authentication

Giulia Droandi, Mauro Barni, Riccardo Lazzeretti et al.

Biometrics security is a dynamic research area spurred by the need to protect personal traits from threats like theft, non-authorised distribution, reuse and so on. A widely investigated solution to such threats consists in processing the biometric signals under encryption, to avoid any leakage of information towards non-authorised parties. In this paper, we propose to leverage on the superior performance of multimodal biometric recognition to improve the efficiency of a biometric-based authentication protocol operating on encrypted data under the malicious security model. In the proposed protocol, authentication relies on both facial and iris biometrics, whose representation accuracy is specifically tailored to trade-off between recognition accuracy and efficiency. From a cryptographic point of view, the protocol relies on SPDZ a new multy-party computation tool designed by Damgaard et al. Experimental results show that the multimodal protocol is faster than corresponding unimodal protocols achieving the same accuracy.

14.0CRFeb 2, 2018

Secure Detection of Image Manipulation by means of Random Feature Selection

Zhipeng Chen, Benedetta Tondi, Xiaolong Li et al.

We address the problem of data-driven image manipulation detection in the presence of an attacker with limited knowledge about the detector. Specifically, we assume that the attacker knows the architecture of the detector, the training data and the class of features V the detector can rely on. In order to get an advantage in his race of arms with the attacker, the analyst designs the detector by relying on a subset of features chosen at random in V. Given its ignorance about the exact feature set, the adversary attacks a version of the detector based on the entire feature set. In this way, the effectiveness of the attack diminishes since there is no guarantee that attacking a detector working in the full feature space will result in a successful attack against the reduced-feature detector. We theoretically prove that, thanks to random feature selection, the security of the detector increases significantly at the expense of a negligible loss of performance in the absence of attacks. We also provide an experimental validation of the proposed procedure by focusing on the detection of two specific kinds of image manipulations, namely adaptive histogram equalization and median filtering. The experiments confirm the gain in security at the expense of a negligible loss of performance in the absence of attacks.

27.5CRAug 2, 2017

Aligned and Non-Aligned Double JPEG Detection Using Convolutional Neural Networks

Mauro Barni, Luca Bondi, Nicolò Bonettini et al.

Due to the wide diffusion of JPEG coding standard, the image forensic community has devoted significant attention to the development of double JPEG (DJPEG) compression detectors through the years. The ability of detecting whether an image has been compressed twice provides paramount information toward image authenticity assessment. Given the trend recently gained by convolutional neural networks (CNN) in many computer vision tasks, in this paper we propose to use CNNs for aligned and non-aligned double JPEG compression detection. In particular, we explore the capability of CNNs to capture DJPEG artifacts directly from images. Results show that the proposed CNN-based detectors achieve good performance even with small size images (i.e., 64x64), outperforming state-of-the-art solutions, especially in the non-aligned case. Besides, good results are also achieved in the commonly-recognized challenging case in which the first quality factor is larger than the second one.

9.1CRMar 27, 2017

Adversarial Source Identification Game with Corrupted Training

Mauro Barni, Benedetta Tondi

We study a variant of the source identification game with training data in which part of the training data is corrupted by an attacker. In the addressed scenario, the defender aims at deciding whether a test sequence has been drawn according to a discrete memoryless source $X \sim P_X$, whose statistics are known to him through the observation of a training sequence generated by $X$. In order to undermine the correct decision under the alternative hypothesis that the test sequence has not been drawn from $X$, the attacker can modify a sequence produced by a source $Y \sim P_Y$ up to a certain distortion, and corrupt the training sequence either by adding some fake samples or by replacing some samples with fake ones. We derive the unique rationalizable equilibrium of the two versions of the game in the asymptotic regime and by assuming that the defender bases its decision by relying only on the first order statistics of the test and the training sequences. By mimicking Stein's lemma, we derive the best achievable performance for the defender when the first type error probability is required to tend to zero exponentially fast with an arbitrarily small, yet positive, error exponent. We then use such a result to analyze the ultimate distinguishability of any two sources as a function of the allowed distortion and the fraction of corrupted samples injected into the training sequence.

1.2SYJul 2, 2015

A Game-Theoretic Framework for Optimum Decision Fusion in the Presence of Byzantines

Andrea Abrardo, Mauro Barni, Kassem Kallas et al.

Optimum decision fusion in the presence of malicious nodes - often referred to as Byzantines - is hindered by the necessity of exactly knowing the statistical behavior of Byzantines. By focusing on a simple, yet widely studied, set-up in which a Fusion Center (FC) is asked to make a binary decision about a sequence of system states by relying on the possibly corrupted decisions provided by local nodes, we propose a game-theoretic framework which permits to exploit the superior performance provided by optimum decision fusion, while limiting the amount of a-priori knowledge required. We first derive the optimum decision strategy by assuming that the statistical behavior of the Byzantines is known. Then we relax such an assumption by casting the problem into a game-theoretic framework in which the FC tries to guess the behavior of the Byzantines, which, in turn, must fix their corruption strategy without knowing the guess made by the FC. We use numerical simulations to derive the equilibrium of the game, thus identifying the optimum behavior for both the FC and the Byzantines, and to evaluate the achievable performance at the equilibrium. We analyze several different setups, showing that in all cases the proposed solution permits to improve the accuracy of data fusion. We also show that, in some instances, it is preferable for the Byzantines to minimize the mutual information between the status of the observed system and the reports submitted to the FC, rather than always flipping the decision made by the local nodes as it is customarily assumed in previous works.

1.2SYMar 19, 2015

Optimum Fusion of Possibly Corrupted Reports for Distributed Detection in Multi-Sensor Networks

Andrea Abrardo, Mauro Barni, Kassem Kallas et al.

The most common approach to mitigate the impact that the presence of malicious nodes has on the accuracy of decision fusion schemes consists in observing the behavior of the nodes over a time interval T and then removing the reports of suspect nodes from the fusion process. By assuming that some a-priori information about the presence of malicious nodes and their behavior is available, we show that the information stemming from the suspect nodes can be exploited to further improve the decision fusion accuracy. Specifically, we derive the optimum fusion rule and analyze the achievable performance for two specific cases. In the first case, the states of the nodes (corrupted or honest) are independent of each other and the fusion center knows only the probability that a node is malicious. In the second case, the exact number of corrupted nodes is fixed and known to the fusion center. We also investigate the optimum corruption strategy for the malicious nodes, showing that always reverting the local decision does not necessarily maximize the loss of performance at the fusion center.

1.2ITMar 7, 2014

Compressive Hyperspectral Imaging Using Progressive Total Variation

Simeon Kamdem Kuiteing, Giulio Coluccia, Alessandro Barducci et al.

Compressed Sensing (CS) is suitable for remote acquisition of hyperspectral images for earth observation, since it could exploit the strong spatial and spectral correlations, llowing to simplify the architecture of the onboard sensors. Solutions proposed so far tend to decouple spatial and spectral dimensions to reduce the complexity of the reconstruction, not taking into account that onboard sensors progressively acquire spectral rows rather than acquiring spectral channels. For this reason, we propose a novel progressive CS architecture based on separate sensing of spectral rows and joint reconstruction employing Total Variation. Experimental results run on raw AVIRIS and AIRS images confirm the validity of the proposed system.