CRMay 31, 2022
CASSOCK: Viable Backdoor Attacks against DNN in The Wall of Source-Specific Backdoor DefencesShang Wang, Yansong Gao, Anmin Fu et al. · nvidia, utoronto
As a critical threat to deep neural networks (DNNs), backdoor attacks can be categorized into two types, i.e., source-agnostic backdoor attacks (SABAs) and source-specific backdoor attacks (SSBAs). Compared to traditional SABAs, SSBAs are more advanced in that they have superior stealthier in bypassing mainstream countermeasures that are effective against SABAs. Nonetheless, existing SSBAs suffer from two major limitations. First, they can hardly achieve a good trade-off between ASR (attack success rate) and FPR (false positive rate). Besides, they can be effectively detected by the state-of-the-art (SOTA) countermeasures (e.g., SCAn). To address the limitations above, we propose a new class of viable source-specific backdoor attacks, coined as CASSOCK. Our key insight is that trigger designs when creating poisoned data and cover data in SSBAs play a crucial role in demonstrating a viable source-specific attack, which has not been considered by existing SSBAs. With this insight, we focus on trigger transparency and content when crafting triggers for poisoned dataset where a sample has an attacker-targeted label and cover dataset where a sample has a ground-truth label. Specifically, we implement $CASSOCK_{Trans}$ and $CASSOCK_{Cont}$. While both they are orthogonal, they are complementary to each other, generating a more powerful attack, called $CASSOCK_{Comp}$, with further improved attack performance and stealthiness. We perform a comprehensive evaluation of the three $CASSOCK$-based attacks on four popular datasets and three SOTA defenses. Compared with a representative SSBA as a baseline ($SSBA_{Base}$), $CASSOCK$-based attacks have significantly advanced the attack performance, i.e., higher ASR and lower FPR with comparable CDA (clean data accuracy). Besides, $CASSOCK$-based attacks have effectively bypassed the SOTA defenses, and $SSBA_{Base}$ cannot.
AIApr 5, 2022
Towards Explainable Meta-Learning for DDoS DetectionQianru Zhou, Rongzhen Li, Lei Xu et al.
The Internet is the most complex machine humankind has ever built, and how to defense it from intrusions is even more complex. With the ever increasing of new intrusions, intrusion detection task rely on Artificial Intelligence more and more. Interpretability and transparency of the machine learning model is the foundation of trust in AI-driven intrusion detection results. Current interpretation Artificial Intelligence technologies in intrusion detection are heuristic, which is neither accurate nor sufficient. This paper proposed a rigorous interpretable Artificial Intelligence driven intrusion detection approach, based on artificial immune system. Details of rigorous interpretation calculation process for a decision tree model is presented. Prime implicant explanation for benign traffic flow are given in detail as rule for negative selection of the cyber immune system. Experiments are carried out in real-life traffic.
LGFeb 3, 2023
Vertical Federated Learning: Taxonomies, Threats, and ProspectsQun Li, Chandra Thapa, Lawrence Ong et al.
Federated learning (FL) is the most popular distributed machine learning technique. FL allows machine-learning models to be trained without acquiring raw data to a single point for processing. Instead, local models are trained with local data; the models are then shared and combined. This approach preserves data privacy as locally trained models are shared instead of the raw data themselves. Broadly, FL can be divided into horizontal federated learning (HFL) and vertical federated learning (VFL). For the former, different parties hold different samples over the same set of features; for the latter, different parties hold different feature data belonging to the same set of samples. In a number of practical scenarios, VFL is more relevant than HFL as different companies (e.g., bank and retailer) hold different features (e.g., credit history and shopping history) for the same set of customers. Although VFL is an emerging area of research, it is not well-established compared to HFL. Besides, VFL-related studies are dispersed, and their connections are not intuitive. Thus, this survey aims to bring these VFL-related studies to one place. Firstly, we classify existing VFL structures and algorithms. Secondly, we present the threats from security and privacy perspectives to VFL. Thirdly, for the benefit of future researchers, we discussed the challenges and prospects of VFL in detail.
CVSep 6, 2022
TransCAB: Transferable Clean-Annotation Backdoor to Object Detection with Natural Trigger in Real-WorldHua Ma, Yinshan Li, Yansong Gao et al.
Object detection is the foundation of various critical computer-vision tasks such as segmentation, object tracking, and event detection. To train an object detector with satisfactory accuracy, a large amount of data is required. However, due to the intensive workforce involved with annotating large datasets, such a data curation task is often outsourced to a third party or relied on volunteers. This work reveals severe vulnerabilities of such data curation pipeline. We propose MACAB that crafts clean-annotated images to stealthily implant the backdoor into the object detectors trained on them even when the data curator can manually audit the images. We observe that the backdoor effect of both misclassification and the cloaking are robustly achieved in the wild when the backdoor is activated with inconspicuously natural physical triggers. Backdooring non-classification object detection with clean-annotation is challenging compared to backdooring existing image classification tasks with clean-label, owing to the complexity of having multiple objects within each frame, including victim and non-victim objects. The efficacy of the MACAB is ensured by constructively i abusing the image-scaling function used by the deep learning framework, ii incorporating the proposed adversarial clean image replica technique, and iii combining poison data selection criteria given constrained attacking budget. Extensive experiments demonstrate that MACAB exhibits more than 90% attack success rate under various real-world scenes. This includes both cloaking and misclassification backdoor effect even restricted with a small attack budget. The poisoned samples cannot be effectively identified by state-of-the-art detection techniques.The comprehensive video demo is at https://youtu.be/MA7L_LpXkp4, which is based on a poison rate of 0.14% for YOLOv4 cloaking backdoor and Faster R-CNN misclassification backdoor.
CRApr 13, 2022
Towards A Critical Evaluation of Robustness for Deep Learning Backdoor CountermeasuresHuming Qiu, Hua Ma, Zhi Zhang et al.
Since Deep Learning (DL) backdoor attacks have been revealed as one of the most insidious adversarial attacks, a number of countermeasures have been developed with certain assumptions defined in their respective threat models. However, the robustness of these countermeasures is inadvertently ignored, which can introduce severe consequences, e.g., a countermeasure can be misused and result in a false implication of backdoor detection. For the first time, we critically examine the robustness of existing backdoor countermeasures with an initial focus on three influential model-inspection ones that are Neural Cleanse (S&P'19), ABS (CCS'19), and MNTD (S&P'21). Although the three countermeasures claim that they work well under their respective threat models, they have inherent unexplored non-robust cases depending on factors such as given tasks, model architectures, datasets, and defense hyper-parameter, which are \textit{not even rooted from delicate adaptive attacks}. We demonstrate how to trivially bypass them aligned with their respective threat models by simply varying aforementioned factors. Particularly, for each defense, formal proofs or empirical studies are used to reveal its two non-robust cases where it is not as robust as it claims or expects, especially the recent MNTD. This work highlights the necessity of thoroughly evaluating the robustness of backdoor countermeasures to avoid their misleading security implications in unknown non-robust cases.
CROct 1, 2023
Watch Out! Simple Horizontal Class Backdoor Can Trivially Evade DefenseHua Ma, Shang Wang, Yansong Gao et al.
All current backdoor attacks on deep learning (DL) models fall under the category of a vertical class backdoor (VCB) -- class-dependent. In VCB attacks, any sample from a class activates the implanted backdoor when the secret trigger is present. Existing defense strategies overwhelmingly focus on countering VCB attacks, especially those that are source-class-agnostic. This narrow focus neglects the potential threat of other simpler yet general backdoor types, leading to false security implications. This study introduces a new, simple, and general type of backdoor attack coined as the horizontal class backdoor (HCB) that trivially breaches the class dependence characteristic of the VCB, bringing a fresh perspective to the community. HCB is now activated when the trigger is presented together with an innocuous feature, regardless of class. For example, the facial recognition model misclassifies a person who wears sunglasses with a smiling innocuous feature into the targeted person, such as an administrator, regardless of which person. The key is that these innocuous features are horizontally shared among classes but are only exhibited by partial samples per class. Extensive experiments on attacking performance across various tasks, including MNIST, facial recognition, traffic sign recognition, object detection, and medical diagnosis, confirm the high efficiency and effectiveness of the HCB. We rigorously evaluated the evasiveness of the HCB against a series of eleven representative countermeasures, including Fine-Pruning (RAID 18'), STRIP (ACSAC 19'), Neural Cleanse (Oakland 19'), ABS (CCS 19'), Februus (ACSAC 20'), NAD (ICLR 21'), MNTD (Oakland 21'), SCAn (USENIX SEC 21'), MOTH (Oakland 22'), Beatrix (NDSS 23'), and MM-BD (Oakland 24'). None of these countermeasures prove robustness, even when employing a simplistic trigger, such as a small and static white-square patch.
70.9CRApr 24
ArmSSL: Adversarial Robust Black-Box Watermarking for Self-Supervised Learning Pre-trained EncodersYongqi Jiang, Yansong Gao, Boyu Kuang et al.
Self-supervised learning (SSL) encoders are invaluable intellectual property (IP). However, no existing SSL watermarking for IP protection can concurrently satisfy the following two practical requirements: (1) provide ownership verification capability under black-box suspect model access once the stolen encoders are used in downstream tasks; (2) be robust under adversarial watermark detection or removal, because the watermark samples form a distinguishable out-of-distribution (OOD) cluster. We propose ArmSSL, an SSL watermarking framework that assures black-box verifiability and adversarial robustness while preserving utility. For verification, we introduce paired discrepancy enlargement, enforcing feature-space orthogonality between the clean and its watermark counterpart to produce a reliable verification signal in black-box against the suspect model. For adversarial robustness, ArmSSL integrates latent representation entanglement and distribution alignment to suppress the OOD clustering. The former entangles watermark representations with clean representations (i.e., from non-source-class) to avoid forming a dense cluster of watermark samples, while the latter minimizes the distributional discrepancy between watermark and clean representations, thereby disguising watermark samples as natural in-distribution data. For utility, a reference-guided watermark tuning strategy is designed to allow the watermark to be learned as a small side task without affecting the main task by aligning the watermarked encoder's outputs with those of the original clean encoder on normal data. Extensive experiments across five mainstream SSL frameworks and nine benchmark datasets, along with end-to-end comparisons with SOTAs, demonstrate that ArmSSL achieves superior ownership verification, negligible utility degradation, and strong robustness against various adversarial detection and removal.
LGAug 12, 2024
TruVRF: Towards Triple-Granularity Verification on Machine UnlearningChunyi Zhou, Anmin Fu, Zhiyang Dai
The concept of the right to be forgotten has led to growing interest in machine unlearning, but reliable validation methods are lacking, creating opportunities for dishonest model providers to mislead data contributors. Traditional invasive methods like backdoor injection are not feasible for legacy data. To address this, we introduce TruVRF, a non-invasive unlearning verification framework operating at class-, volume-, and sample-level granularities. TruVRF includes three Unlearning-Metrics designed to detect different types of dishonest servers: Neglecting, Lazy, and Deceiving. Unlearning-Metric-I checks class alignment, Unlearning-Metric-II verifies sample count, and Unlearning-Metric-III confirms specific sample deletion. Evaluations on three datasets show TruVRF's robust performance, with over 90% accuracy for Metrics I and III, and a 4.8% to 8.2% inference deviation for Metric II. TruVRF also demonstrates generalizability and practicality across various conditions and with state-of-the-art unlearning frameworks like SISA and Amnesiac Unlearning.
59.2CRMay 3
Repurposing and Evaluating the (In)Feasibility of Dataset Poisoning enabled Watermarking for Contrastive LearningZhiyang Dai, Yansong Gao, Boyu Kuang et al.
Contrastive learning (CL) reduces annotation cost via auto-derived supervisory signals. Since large-scale in-house CL datasets are infeasible, reliance on third-party or internet data is common. Recent studies show CL models are vulnerable to data-poisoning backdoor attacks, but their generalization and robustness are underexplored. We systematically evaluate existing data-poisoning backdoor attacks on CL, revealing limitations: poor dataset adaptability, low success rates, limited portability, and restrictive assumptions (e.g., downstream task knowledge). Interestingly, trigger samples exhibit distinguishable statistical divergence from clean samples, which inspires repurposing it as a watermark for dataset IP protection. Direct repurposing is challenging due to low success rates; we overcome this by statistical verification using a unified density metric. We further propose a multi-level watermarking scheme adapting to feature-level, soft-label, or hard-label outputs in CL. Experiments show some backdoor attacks can be repurposed as effective watermarks with trade-offs among fidelity, verifiability, and robustness. This work demonstrates weak backdoor effects become reliable signals for dataset IP protection in challenging CL settings.
LGMar 13, 2024
Machine Unlearning: Taxonomy, Metrics, Applications, Challenges, and ProspectsNa Li, Chunyi Zhou, Yansong Gao et al.
Personal digital data is a critical asset, and governments worldwide have enforced laws and regulations to protect data privacy. Data users have been endowed with the right to be forgotten of their data. In the course of machine learning (ML), the forgotten right requires a model provider to delete user data and its subsequent impact on ML models upon user requests. Machine unlearning emerges to address this, which has garnered ever-increasing attention from both industry and academia. While the area has developed rapidly, there is a lack of comprehensive surveys to capture the latest advancements. Recognizing this shortage, we conduct an extensive exploration to map the landscape of machine unlearning including the (fine-grained) taxonomy of unlearning algorithms under centralized and distributed settings, debate on approximate unlearning, verification and evaluation metrics, challenges and solutions for unlearning under different applications, as well as attacks targeting machine unlearning. The survey concludes by outlining potential directions for future research, hoping to serve as a guide for interested scholars.
LGMay 24, 2024
Decaf: Data Distribution Decompose Attack against Federated LearningZhiyang Dai, Chunyi Zhou, Anmin Fu
In contrast to prevalent Federated Learning (FL) privacy inference techniques such as generative adversarial networks attacks, membership inference attacks, property inference attacks, and model inversion attacks, we devise an innovative privacy threat: the Data Distribution Decompose Attack on FL, termed Decaf. This attack enables an honest-but-curious FL server to meticulously profile the proportion of each class owned by the victim FL user, divulging sensitive information like local market item distribution and business competitiveness. The crux of Decaf lies in the profound observation that the magnitude of local model gradient changes closely mirrors the underlying data distribution, including the proportion of each class. Decaf addresses two crucial challenges: accurately identify the missing/null class(es) given by any victim user as a premise and then quantify the precise relationship between gradient changes and each remaining non-null class. Notably, Decaf operates stealthily, rendering it entirely passive and undetectable to victim users regarding the infringement of their data distribution privacy. Experimental validation on five benchmark datasets (MNIST, FASHION-MNIST, CIFAR-10, FER-2013, and SkinCancer) employing diverse model architectures, including customized convolutional networks, standardized VGG16, and ResNet18, demonstrates Decaf's efficacy. Results indicate its ability to accurately decompose local user data distribution, regardless of whether it is IID or non-IID distributed. Specifically, the dissimilarity measured using $L_{\infty}$ distance between the distribution decomposed by Decaf and ground truth is consistently below 5\% when no null classes exist. Moreover, Decaf achieves 100\% accuracy in determining any victim user's null classes, validated through formal proof.
CRNov 7, 2024
Intellectual Property Protection for Deep Learning Model and Dataset IntelligenceYongqi Jiang, Yansong Gao, Chunyi Zhou et al.
With the growing applications of Deep Learning (DL), especially recent spectacular achievements of Large Language Models (LLMs) such as ChatGPT and LLaMA, the commercial significance of these remarkable models has soared. However, acquiring well-trained models is costly and resource-intensive. It requires a considerable high-quality dataset, substantial investment in dedicated architecture design, expensive computational resources, and efforts to develop technical expertise. Consequently, safeguarding the Intellectual Property (IP) of well-trained models is attracting increasing attention. In contrast to existing surveys overwhelmingly focusing on model IPP mainly, this survey not only encompasses the protection on model level intelligence but also valuable dataset intelligence. Firstly, according to the requirements for effective IPP design, this work systematically summarizes the general and scheme-specific performance evaluation metrics. Secondly, from proactive IP infringement prevention and reactive IP ownership verification perspectives, it comprehensively investigates and analyzes the existing IPP methods for both dataset and model intelligence. Additionally, from the standpoint of training settings, it delves into the unique challenges that distributed settings pose to IPP compared to centralized settings. Furthermore, this work examines various attacks faced by deep IPP techniques. Finally, we outline prospects for promising future directions that may act as a guide for innovative research.
CRMar 6, 2025
From Pixels to Trajectory: Universal Adversarial Example Detection via Temporal ImprintsYansong Gao, Huaibing Peng, Hua Ma et al.
For the first time, we unveil discernible temporal (or historical) trajectory imprints resulting from adversarial example (AE) attacks. Standing in contrast to existing studies all focusing on spatial (or static) imprints within the targeted underlying victim models, we present a fresh temporal paradigm for understanding these attacks. Of paramount discovery is that these imprints are encapsulated within a single loss metric, spanning universally across diverse tasks such as classification and regression, and modalities including image, text, and audio. Recognizing the distinct nature of loss between adversarial and clean examples, we exploit this temporal imprint for AE detection by proposing TRAIT (TRaceable Adversarial temporal trajectory ImprinTs). TRAIT operates under minimal assumptions without prior knowledge of attacks, thereby framing the detection challenge as a one-class classification problem. However, detecting AEs is still challenged by significant overlaps between the constructed synthetic losses of adversarial and clean examples due to the absence of ground truth for incoming inputs. TRAIT addresses this challenge by converting the synthetic loss into a spectrum signature, using the technique of Fast Fourier Transform to highlight the discrepancies, drawing inspiration from the temporal nature of the imprints, analogous to time-series signals. Across 12 AE attacks including SMACK (USENIX Sec'2023), TRAIT demonstrates consistent outstanding performance across comprehensively evaluated modalities, tasks, datasets, and model architectures. In all scenarios, TRAIT achieves an AE detection accuracy exceeding 97%, often around 99%, while maintaining a false rejection rate of 1%. TRAIT remains effective under the formulated strong adaptive attacks.
CRJul 22, 2025
CompLeak: Deep Learning Model Compression Exacerbates Privacy LeakageNa Li, Yansong Gao, Hongsheng Hu et al.
Model compression is crucial for minimizing memory storage and accelerating inference in deep learning (DL) models, including recent foundation models like large language models (LLMs). Users can access different compressed model versions according to their resources and budget. However, while existing compression operations primarily focus on optimizing the trade-off between resource efficiency and model performance, the privacy risks introduced by compression remain overlooked and insufficiently understood. In this work, through the lens of membership inference attack (MIA), we propose CompLeak, the first privacy risk evaluation framework examining three widely used compression configurations that are pruning, quantization, and weight clustering supported by the commercial model compression framework of Google's TensorFlow-Lite (TF-Lite) and Facebook's PyTorch Mobile. CompLeak has three variants, given available access to the number of compressed models and original model. CompLeakNR starts by adopting existing MIA methods to attack a single compressed model, and identifies that different compressed models influence members and non-members differently. When the original model and one compressed model are available, CompLeakSR leverages the compressed model as a reference to the original model and uncovers more privacy by combining meta information (e.g., confidence vector) from both models. When multiple compressed models are available with/without accessing the original model, CompLeakMR innovatively exploits privacy leakage info from multiple compressed versions to substantially signify the overall privacy leakage. We conduct extensive experiments on seven diverse model architectures (from ResNet to foundation models of BERT and GPT-2), and six image and textual benchmark datasets.
CRJun 28, 2025
Kill Two Birds with One Stone! Trajectory enabled Unified Online Detection of Adversarial Examples and Backdoor AttacksAnmin Fu, Fanyu Meng, Huaibing Peng et al.
The proposed UniGuard is the first unified online detection framework capable of simultaneously addressing adversarial examples and backdoor attacks. UniGuard builds upon two key insights: first, both AE and backdoor attacks have to compromise the inference phase, making it possible to tackle them simultaneously during run-time via online detection. Second, an adversarial input, whether a perturbed sample in AE attacks or a trigger-carrying sample in backdoor attacks, exhibits distinctive trajectory signatures from a benign sample as it propagates through the layers of a DL model in forward inference. The propagation trajectory of the adversarial sample must deviate from that of its benign counterpart; otherwise, the adversarial objective cannot be fulfilled. Detecting these trajectory signatures is inherently challenging due to their subtlety; UniGuard overcomes this by treating the propagation trajectory as a time-series signal, leveraging LSTM and spectrum transformation to amplify differences between adversarial and benign trajectories that are subtle in the time domain. UniGuard exceptional efficiency and effectiveness have been extensively validated across various modalities (image, text, and audio) and tasks (classification and regression), ranging from diverse model architectures against a wide range of AE attacks and backdoor attacks, including challenging partial backdoors and dynamic triggers. When compared to SOTA methods, including ContraNet (NDSS 22) specific for AE detection and TED (IEEE SP 24) specific for backdoor detection, UniGuard consistently demonstrates superior performance, even when matched against each method's strengths in addressing their respective threats-each SOTA fails to parts of attack strategies while UniGuard succeeds for all.
LGFeb 10, 2022
PPA: Preference Profiling Attack Against Federated LearningChunyi Zhou, Yansong Gao, Anmin Fu et al.
Federated learning (FL) trains a global model across a number of decentralized users, each with a local dataset. Compared to traditional centralized learning, FL does not require direct access to local datasets and thus aims to mitigate data privacy concerns. However, data privacy leakage in FL still exists due to inference attacks, including membership inference, property inference, and data inversion. In this work, we propose a new type of privacy inference attack, coined Preference Profiling Attack (PPA), that accurately profiles the private preferences of a local user, e.g., most liked (disliked) items from the client's online shopping and most common expressions from the user's selfies. In general, PPA can profile top-k (i.e., k = 1, 2, 3 and k = 1 in particular) preferences contingent on the local client (user)'s characteristics. Our key insight is that the gradient variation of a local user's model has a distinguishable sensitivity to the sample proportion of a given class, especially the majority (minority) class. By observing a user model's gradient sensitivity to a class, PPA can profile the sample proportion of the class in the user's local dataset, and thus the user's preference of the class is exposed. The inherent statistical heterogeneity of FL further facilitates PPA. We have extensively evaluated the PPA's effectiveness using four datasets (MNIST, CIFAR10, RAF-DB and Products-10K). Our results show that PPA achieves 90% and 98% top-1 attack accuracy to the MNIST and CIFAR10, respectively. More importantly, in real-world commercial scenarios of shopping (i.e., Products-10K) and social network (i.e., RAF-DB), PPA gains a top-1 attack accuracy of 78% in the former case to infer the most ordered items (i.e., as a commercial competitor), and 88% in the latter case to infer a victim user's most often facial expressions, e.g., disgusted.
CVJan 21, 2022
Dangerous Cloaking: Natural Trigger based Backdoor Attacks on Object Detectors in the Physical WorldHua Ma, Yinshan Li, Yansong Gao et al.
Deep learning models have been shown to be vulnerable to recent backdoor attacks. A backdoored model behaves normally for inputs containing no attacker-secretly-chosen trigger and maliciously for inputs with the trigger. To date, backdoor attacks and countermeasures mainly focus on image classification tasks. And most of them are implemented in the digital world with digital triggers. Besides the classification tasks, object detection systems are also considered as one of the basic foundations of computer vision tasks. However, there is no investigation and understanding of the backdoor vulnerability of the object detector, even in the digital world with digital triggers. For the first time, this work demonstrates that existing object detectors are inherently susceptible to physical backdoor attacks. We use a natural T-shirt bought from a market as a trigger to enable the cloaking effect--the person bounding-box disappears in front of the object detector. We show that such a backdoor can be implanted from two exploitable attack scenarios into the object detector, which is outsourced or fine-tuned through a pretrained model. We have extensively evaluated three popular object detection algorithms: anchor-based Yolo-V3, Yolo-V4, and anchor-free CenterNet. Building upon 19 videos shot in real-world scenes, we confirm that the backdoor attack is robust against various factors: movement, distance, angle, non-rigid deformation, and lighting. Specifically, the attack success rate (ASR) in most videos is 100% or close to it, while the clean data accuracy of the backdoored model is the same as its clean counterpart. The latter implies that it is infeasible to detect the backdoor behavior merely through a validation set. The averaged ASR still remains sufficiently high to be 78% in the transfer learning attack scenarios evaluated on CenterNet. See the demo video on https://youtu.be/Q3HOF4OobbY.
CRNov 22, 2021
NTD: Non-Transferability Enabled Backdoor DetectionYinshan Li, Hua Ma, Zhi Zhang et al.
A backdoor deep learning (DL) model behaves normally upon clean inputs but misbehaves upon trigger inputs as the backdoor attacker desires, posing severe consequences to DL model deployments. State-of-the-art defenses are either limited to specific backdoor attacks (source-agnostic attacks) or non-user-friendly in that machine learning (ML) expertise or expensive computing resources are required. This work observes that all existing backdoor attacks have an inevitable intrinsic weakness, non-transferability, that is, a trigger input hijacks a backdoored model but cannot be effective to another model that has not been implanted with the same backdoor. With this key observation, we propose non-transferability enabled backdoor detection (NTD) to identify trigger inputs for a model-under-test (MUT) during run-time.Specifically, NTD allows a potentially backdoored MUT to predict a class for an input. In the meantime, NTD leverages a feature extractor (FE) to extract feature vectors for the input and a group of samples randomly picked from its predicted class, and then compares similarity between the input and the samples in the FE's latent space. If the similarity is low, the input is an adversarial trigger input; otherwise, benign. The FE is a free pre-trained model privately reserved from open platforms. As the FE and MUT are from different sources, the attacker is very unlikely to insert the same backdoor into both of them. Because of non-transferability, a trigger effect that does work on the MUT cannot be transferred to the FE, making NTD effective against different types of backdoor attacks. We evaluate NTD on three popular customized tasks such as face recognition, traffic sign recognition and general animal classification, results of which affirm that NDT has high effectiveness (low false acceptance rate) and usability (low false rejection rate) with low detection latency.
CROct 3, 2021
Design and Evaluate Recomposited OR-AND-XOR-PUFJianrong Yao, Lihui Pang, Zhi Zhang et al.
Physical Unclonable Function (PUF) is a hardware security primitive with a desirable feature of low-cost. Based on the space of challenge-response pairs (CRPs), it has two categories:weak PUF and strong PUF. Though designing a reliable and secure lightweight strong PUF is challenging, there is continuing efforts to fulfill this gap due to wide range of applications enabled by strong PUF. It was prospected that the combination of MAX and MIN bit-wise operation is promising for improving the modeling resilience when MAX and MIN are employed in the PUF recomposition. The main rationale lies on the fact that each bit-wise might be mainly vulnerable to one specific type of modeling attack, combining them can have an improved holistic resilience. This work is to first evaluate the main PUF performance, in particular,uniformity and reliability of the OR-AND-XOR-PUF(OAX-PUF)-(x, y, z)-OAX-PUF. Compared with the most used l-XOR-PUF, the (x, y, z)-OAX-PUF eventually exhibits better reliability given l=x+y+z without degrading the uniformity retaining to be 50%. We further examine the modeling resilience of the (x, y, z)-OAX-PUF with four powerful attacking strategies to date, which are Logistic Regression (LR) attack, reliability assisted CMA-ES attack, multilayer perceptron (MLP) attack, and the most recent hybrid LR-reliability attack. In comparison with the XOR-APUF, the OAX-APUF successfully defeats the CAM-ES attack. However, it shows no notable modeling accuracy drop against other three attacks, though the attacking times have been greatly prolonged to LR and hybrid LR-reliability attacks. Overall, the OAX recomposition could be an alternative lightweight recomposition method compared to XOR towards constructing strong PUFs if the underlying PUF, e.g., FF-APUF, has exhibited improved resilience to modeling attack, because the OAX incurs smaller reliability degradation compared to XOR.
CRAug 20, 2021
Quantization Backdoors to Deep Learning Commercial FrameworksHua Ma, Huming Qiu, Yansong Gao et al.
Currently, there is a burgeoning demand for deploying deep learning (DL) models on ubiquitous edge Internet of Things (IoT) devices attributed to their low latency and high privacy preservation. However, DL models are often large in size and require large-scale computation, which prevents them from being placed directly onto IoT devices, where resources are constrained and 32-bit floating-point (float-32) operations are unavailable. Commercial framework (i.e., a set of toolkits) empowered model quantization is a pragmatic solution that enables DL deployment on mobile devices and embedded systems by effortlessly post-quantizing a large high-precision model (e.g., float-32) into a small low-precision model (e.g., int-8) while retaining the model inference accuracy. However, their usability might be threatened by security vulnerabilities. This work reveals that the standard quantization toolkits can be abused to activate a backdoor. We demonstrate that a full-precision backdoored model which does not have any backdoor effect in the presence of a trigger -- as the backdoor is dormant -- can be activated by the default i) TensorFlow-Lite (TFLite) quantization, the only product-ready quantization framework to date, and ii) the beta released PyTorch Mobile framework. When each of the float-32 models is converted into an int-8 format model through the standard TFLite or Pytorch Mobile framework's post-training quantization, the backdoor is activated in the quantized model, which shows a stable attack success rate close to 100% upon inputs with the trigger, while it behaves normally upon non-trigger inputs. This work highlights that a stealthy security threat occurs when an end user utilizes the on-device post-training model quantization frameworks, informing security researchers of cross-platform overhaul of DL models post quantization even if these models pass front-end backdoor inspections.
CRMay 9, 2021
RBNN: Memory-Efficient Reconfigurable Deep Binary Neural Network with IP Protection for Internet of ThingsHuming Qiu, Hua Ma, Zhi Zhang et al.
Though deep neural network models exhibit outstanding performance for various applications, their large model size and extensive floating-point operations render deployment on mobile computing platforms a major challenge, and, in particular, on Internet of Things devices. One appealing solution is model quantization that reduces the model size and uses integer operations commonly supported by microcontrollers . To this end, a 1-bit quantized DNN model or deep binary neural network maximizes the memory efficiency, where each parameter in a BNN model has only 1-bit. In this paper, we propose a reconfigurable BNN (RBNN) to further amplify the memory efficiency for resource-constrained IoT devices. Generally, the RBNN can be reconfigured on demand to achieve any one of M (M>1) distinct tasks with the same parameter set, thus only a single task determines the memory requirements. In other words, the memory utilization is improved by times M. Our extensive experiments corroborate that up to seven commonly used tasks can co-exist (the value of M can be larger). These tasks with a varying number of classes have no or negligible accuracy drop-off on three binarized popular DNN architectures including VGG, ResNet, and ReActNet. The tasks span across different domains, e.g., computer vision and audio domains validated herein, with the prerequisite that the model architecture can serve those cross-domain tasks. To protect the intellectual property of an RBNN model, the reconfiguration can be controlled by both a user key and a device-unique root key generated by the intrinsic hardware fingerprint. By doing so, an RBNN model can only be used per paid user per authorized device, thus benefiting both the user and the model provider.
CRJul 27, 2020
VFL: A Verifiable Federated Learning with Privacy-Preserving for Big Data in Industrial IoTAnmin Fu, Xianglong Zhang, Naixue Xiong et al.
Due to the strong analytical ability of big data, deep learning has been widely applied to train the collected data in industrial IoT. However, for privacy issues, traditional data-gathering centralized learning is not applicable to industrial scenarios sensitive to training sets. Recently, federated learning has received widespread attention, since it trains a model by only relying on gradient aggregation without accessing training sets. But existing researches reveal that the shared gradient still retains the sensitive information of the training set. Even worse, a malicious aggregation server may return forged aggregated gradients. In this paper, we propose the VFL, verifiable federated learning with privacy-preserving for big data in industrial IoT. Specifically, we use Lagrange interpolation to elaborately set interpolation points for verifying the correctness of the aggregated gradients. Compared with existing schemes, the verification overhead of VFL remains constant regardless of the number of participants. Moreover, we employ the blinding technology to protect the privacy of the gradients submitted by the participants. If no more than n-2 of n participants collude with the aggregation server, VFL could guarantee the encrypted gradients of other participants not being inverted. Experimental evaluations corroborate the practical performance of the presented VFL framework with high accuracy and efficiency.
CRJul 21, 2020
Backdoor Attacks and Countermeasures on Deep Learning: A Comprehensive ReviewYansong Gao, Bao Gia Doan, Zhi Zhang et al.
This work provides the community with a timely comprehensive review of backdoor attacks and countermeasures on deep learning. According to the attacker's capability and affected stage of the machine learning pipeline, the attack surfaces are recognized to be wide and then formalized into six categorizations: code poisoning, outsourcing, pretrained, data collection, collaborative learning and post-deployment. Accordingly, attacks under each categorization are combed. The countermeasures are categorized into four general classes: blind backdoor removal, offline backdoor inspection, online backdoor inspection, and post backdoor removal. Accordingly, we review countermeasures, and compare and analyze their advantages and disadvantages. We have also reviewed the flip side of backdoor attacks, which are explored for i) protecting intellectual property of deep learning models, ii) acting as a honeypot to catch adversarial example attacks, and iii) verifying data deletion requested by the data contributor.Overall, the research on defense is far behind the attack, and there is no single defense that can prevent all types of backdoor attacks. In some cases, an attacker can intelligently bypass existing defenses with an adaptive attack. Drawing the insights from the systematic review, we also present key areas for future research on the backdoor, such as empirical security evaluations from physical trigger attacks, and in particular, more efficient and practical countermeasures are solicited.
CRMar 1, 2020
Authentication, Access Control, Privacy, Threats and Trust Management Towards Securing Fog Computing Environments: A ReviewAbdullah Al-Noman Patwary, Anmin Fu, Ranesh Kumar Naha et al.
Fog computing is an emerging computing paradigm that has come into consideration for the deployment of IoT applications amongst researchers and technology industries over the last few years. Fog is highly distributed and consists of a wide number of autonomous end devices, which contribute to the processing. However, the variety of devices offered across different users are not audited. Hence, the security of Fog devices is a major concern in the Fog computing environment. Furthermore, mitigating and preventing those security measures is a research issue. Therefore, to provide the necessary security for Fog devices, we need to understand what the security concerns are with regards to Fog. All aspects of Fog security, which have not been covered by other literature works needs to be identified and need to be aggregate all issues in Fog security. It needs to be noted that computation devices consist of many ordinary users, and are not managed by any central entity or managing body. Therefore, trust and privacy is also a key challenge to gain market adoption for Fog. To provide the required trust and privacy, we need to also focus on authentication, threats and access control mechanisms as well as techniques in Fog computing. In this paper, we perform a survey and propose a taxonomy, which presents an overview of existing security concerns in the context of the Fog computing paradigm. We discuss the Blockchain-based solutions towards a secure Fog computing environment and presented various research challenges and directions for future research.