Xiaoyu Cao

CR
h-index8
27papers
4,937citations
Novelty58%
AI Score52

27 Papers

CRJul 19, 2022Code
FLDetector: Defending Federated Learning Against Model Poisoning Attacks via Detecting Malicious Clients

Zaixi Zhang, Xiaoyu Cao, Jinyuan Jia et al.

Federated learning (FL) is vulnerable to model poisoning attacks, in which malicious clients corrupt the global model via sending manipulated model updates to the server. Existing defenses mainly rely on Byzantine-robust FL methods, which aim to learn an accurate global model even if some clients are malicious. However, they can only resist a small number of malicious clients in practice. It is still an open challenge how to defend against model poisoning attacks with a large number of malicious clients. Our FLDetector addresses this challenge via detecting malicious clients. FLDetector aims to detect and remove the majority of the malicious clients such that a Byzantine-robust FL method can learn an accurate global model using the remaining clients. Our key observation is that, in model poisoning attacks, the model updates from a client in multiple iterations are inconsistent. Therefore, FLDetector detects malicious clients via checking their model-updates consistency. Roughly speaking, the server predicts a client's model update in each iteration based on its historical model updates using the Cauchy mean value theorem and L-BFGS, and flags a client as malicious if the received model update from the client and the predicted model update are inconsistent in multiple iterations. Our extensive experiments on three benchmark datasets show that FLDetector can accurately detect malicious clients in multiple state-of-the-art model poisoning attacks. After removing the detected malicious clients, existing Byzantine-robust FL methods can learn accurate global models.Our code is available at https://github.com/zaixizhang/FLDetector.

CVAug 22, 2023Code
TOPIC: A Parallel Association Paradigm for Multi-Object Tracking under Complex Motions and Diverse Scenes

Xiaoyan Cao, Yiyao Zheng, Yao Yao et al.

Video data and algorithms have been driving advances in multi-object tracking (MOT). While existing MOT datasets focus on occlusion and appearance similarity, complex motion patterns are widespread yet overlooked. To address this issue, we introduce a new dataset called BEE24 to highlight complex motions. Identity association algorithms have long been the focus of MOT research. Existing trackers can be categorized into two association paradigms: single-feature paradigm (based on either motion or appearance feature) and serial paradigm (one feature serves as secondary while the other is primary). However, these paradigms are incapable of fully utilizing different features. In this paper, we propose a parallel paradigm and present the Two rOund Parallel matchIng meChanism (TOPIC) to implement it. The TOPIC leverages both motion and appearance features and can adaptively select the preferable one as the assignment metric based on motion level. Moreover, we provide an Attention-based Appearance Reconstruction Module (AARM) to reconstruct appearance feature embeddings, thus enhancing the representation of appearance features. Comprehensive experiments show that our approach achieves state-of-the-art performance on four public datasets and BEE24. Moreover, BEE24 challenges existing trackers to track multiple similar-appearing small objects with complex motions over long periods, which is critical in real-world applications such as beekeeping and drone swarm surveillance. Notably, our proposed parallel paradigm surpasses the performance of existing association paradigms by a large margin, e.g., reducing false negatives by 6% to 81% compared to the single-feature association paradigm. The introduced dataset and association paradigm in this work offer a fresh perspective for advancing the MOT field. The source code and dataset are available at https://github.com/holmescao/TOPICTrack.

CRMar 16, 2022
MPAF: Model Poisoning Attacks to Federated Learning based on Fake Clients

Xiaoyu Cao, Neil Zhenqiang Gong

Existing model poisoning attacks to federated learning assume that an attacker has access to a large fraction of compromised genuine clients. However, such assumption is not realistic in production federated learning systems that involve millions of clients. In this work, we propose the first Model Poisoning Attack based on Fake clients called MPAF. Specifically, we assume the attacker injects fake clients to a federated learning system and sends carefully crafted fake local model updates to the cloud server during training, such that the learnt global model has low accuracy for many indiscriminate test inputs. Towards this goal, our attack drags the global model towards an attacker-chosen base model that has low accuracy. Specifically, in each round of federated learning, the fake clients craft fake local model updates that point to the base model and scale them up to amplify their impact before sending them to the cloud server. Our experiments show that MPAF can significantly decrease the test accuracy of the global model, even if classical defenses and norm clipping are adopted, highlighting the need for more advanced defenses.

CROct 20, 2022
FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information

Xiaoyu Cao, Jinyuan Jia, Zaixi Zhang et al.

Federated learning is vulnerable to poisoning attacks in which malicious clients poison the global model via sending malicious model updates to the server. Existing defenses focus on preventing a small number of malicious clients from poisoning the global model via robust federated learning methods and detecting malicious clients when there are a large number of them. However, it is still an open challenge how to recover the global model from poisoning attacks after the malicious clients are detected. A naive solution is to remove the detected malicious clients and train a new global model from scratch, which incurs large cost that may be intolerable for resource-constrained clients such as smartphones and IoT devices. In this work, we propose FedRecover, which can recover an accurate global model from poisoning attacks with small cost for the clients. Our key idea is that the server estimates the clients' model updates instead of asking the clients to compute and communicate them during the recovery process. In particular, the server stores the global models and clients' model updates in each round, when training the poisoned global model. During the recovery process, the server estimates a client's model update in each round using its stored historical information. Moreover, we further optimize FedRecover to recover a more accurate global model using warm-up, periodic correction, abnormality fixing, and final tuning strategies, in which the server asks the clients to compute and communicate their exact model updates. Theoretically, we show that the global model recovered by FedRecover is close to or the same as that recovered by train-from-scratch under some assumptions. Empirically, our evaluation on four datasets, three federated learning methods, as well as untargeted and targeted poisoning attacks (e.g., backdoor attacks) shows that FedRecover is both accurate and efficient.

CROct 2, 2022
FLCert: Provably Secure Federated Learning against Poisoning Attacks

Xiaoyu Cao, Zaixi Zhang, Jinyuan Jia et al.

Due to its distributed nature, federated learning is vulnerable to poisoning attacks, in which malicious clients poison the training process via manipulating their local training data and/or local model updates sent to the cloud server, such that the poisoned global model misclassifies many indiscriminate test inputs or attacker-chosen ones. Existing defenses mainly leverage Byzantine-robust federated learning methods or detect malicious clients. However, these defenses do not have provable security guarantees against poisoning attacks and may be vulnerable to more advanced attacks. In this work, we aim to bridge the gap by proposing FLCert, an ensemble federated learning framework, that is provably secure against poisoning attacks with a bounded number of malicious clients. Our key idea is to divide the clients into groups, learn a global model for each group of clients using any existing federated learning method, and take a majority vote among the global models to classify a test input. Specifically, we consider two methods to group the clients and propose two variants of FLCert correspondingly, i.e., FLCert-P that randomly samples clients in each group, and FLCert-D that divides clients to disjoint groups deterministically. Our extensive experiments on multiple datasets show that the label predicted by our FLCert for a test input is provably unaffected by a bounded number of malicious clients, no matter what poisoning attacks they use.

CVApr 9, 2022
A dataset of ant colonies motion trajectories in indoor and outdoor scenes for social cluster behavior study

Meihong Wu, Xiaoyan Cao, Xiaoyu Cao et al.

Motion and interaction of social insects (such as ants) have been studied by many researchers to understand the clustering mechanism. Most studies in the field of ant behavior have only focused on indoor environments, while outdoor environments are still underexplored. In this paper, we collect 10 videos of ant colonies from different indoor and outdoor scenes. And we develop an image sequence marking software named VisualMarkData, which enables us to provide annotations of ants in the video. In all 5354 frames, the location information and the identification number of each ant are recorded for a total of 712 ants and 114112 annotations. Moreover, we provide visual analysis tools to assess and validate the technical quality and reproducibility of our data. It is hoped that this dataset will contribute to a deeper exploration on the behavior of the ant colony.

CRAug 11, 2020Code
Intrinsic Certified Robustness of Bagging against Data Poisoning Attacks

Jinyuan Jia, Xiaoyu Cao, Neil Zhenqiang Gong

In a \emph{data poisoning attack}, an attacker modifies, deletes, and/or inserts some training examples to corrupt the learnt machine learning model. \emph{Bootstrap Aggregating (bagging)} is a well-known ensemble learning method, which trains multiple base models on random subsamples of a training dataset using a base learning algorithm and uses majority vote to predict labels of testing examples. We prove the intrinsic certified robustness of bagging against data poisoning attacks. Specifically, we show that bagging with an arbitrary base learning algorithm provably predicts the same label for a testing example when the number of modified, deleted, and/or inserted training examples is bounded by a threshold. Moreover, we show that our derived threshold is tight if no assumptions on the base learning algorithm are made. We evaluate our method on MNIST and CIFAR10. For instance, our method achieves a certified accuracy of $91.1\%$ on MNIST when arbitrarily modifying, deleting, and/or inserting 100 training examples. Code is available at: \url{https://github.com/jjy1994/BaggingCertifyDataPoisoning}.

LGDec 20, 2019Code
Certified Robustness for Top-k Predictions against Adversarial Perturbations via Randomized Smoothing

Jinyuan Jia, Xiaoyu Cao, Binghui Wang et al.

It is well-known that classifiers are vulnerable to adversarial perturbations. To defend against adversarial perturbations, various certified robustness results have been derived. However, existing certified robustnesses are limited to top-1 predictions. In many real-world applications, top-$k$ predictions are more relevant. In this work, we aim to derive certified robustness for top-$k$ predictions. In particular, our certified robustness is based on randomized smoothing, which turns any classifier to a new classifier via adding noise to an input example. We adopt randomized smoothing because it is scalable to large-scale neural networks and applicable to any classifier. We derive a tight robustness in $\ell_2$ norm for top-$k$ predictions when using randomized smoothing with Gaussian noise. We find that generalizing the certified robustness from top-1 to top-$k$ predictions faces significant technical challenges. We also empirically evaluate our method on CIFAR10 and ImageNet. For example, our method can obtain an ImageNet classifier with a certified top-5 accuracy of 62.8\% when the $\ell_2$-norms of the adversarial perturbations are less than 0.5 (=127/255). Our code is publicly available at: \url{https://github.com/jjy1994/Certify_Topk}.

IRJan 14
DSL-R1: From SQL to DSL for Training Retrieval Agents across Structured and Unstructured Data with Reinforcement Learning

Yunhai Hu, Junwei Zhou, Yumo Cao et al.

Effective retrieval in complex domains requires bridging the gap between structured metadata and unstructured content. Existing systems typically isolate these capabilities, relying on either symbolic filtering or vector similarity, failing to capture their interplay. In this work, we propose DSL-R1, a unified framework that synergizes logical reasoning with semantic matching via a novel Domain-Specific Language (DSL). By embedding vector primitives within SQL-style operators, our approach leverages the complementary strengths of symbolic precision and semantic coverage. We further introduce a reinforcement learning mechanism where rule-based execution feedback and retrieval quality rewards jointly optimize the DSL generation, balancing structural correctness and semantic alignment. Evaluations on a large-scale industrial email benchmark demonstrate that DSL-R1 achieves a +12.3% improvement in Hit@1/3, consistently outperforming decoupled baselines and establishing a robust paradigm for hybrid retrieval.

CVOct 1, 2025
TextCAM: Explaining Class Activation Map with Text

Qiming Zhao, Xingjian Li, Xiaoyu Cao et al.

Deep neural networks (DNNs) have achieved remarkable success across domains but remain difficult to interpret, limiting their trustworthiness in high-stakes applications. This paper focuses on deep vision models, for which a dominant line of explainability methods are Class Activation Mapping (CAM) and its variants working by highlighting spatial regions that drive predictions. We figure out that CAM provides little semantic insight into what attributes underlie these activations. To address this limitation, we propose TextCAM, a novel explanation framework that enriches CAM with natural languages. TextCAM combines the precise spatial localization of CAM with the semantic alignment of vision-language models (VLMs). Specifically, we derive channel-level semantic representations using CLIP embeddings and linear discriminant analysis, and aggregate them with CAM weights to produce textual descriptions of salient visual evidence. This yields explanations that jointly specify where the model attends and what visual attributes likely support its decision. We further extend TextCAM to generate feature channels into semantically coherent groups, enabling more fine-grained visual-textual explanations. Experiments on ImageNet, CLEVR, and CUB demonstrate that TextCAM produces faithful and interpretable rationales that improve human understanding, detect spurious correlations, and preserve model fidelity.

CVSep 29, 2025
Towards Foundation Models for Cryo-ET Subtomogram Analysis

Runmin Jiang, Wanyue Feng, Yuntian Yang et al. · cmu, harvard

Cryo-electron tomography (cryo-ET) enables in situ visualization of macromolecular structures, where subtomogram analysis tasks such as classification, alignment, and averaging are critical for structural determination. However, effective analysis is hindered by scarce annotations, severe noise, and poor generalization. To address these challenges, we take the first step towards foundation models for cryo-ET subtomograms. First, we introduce CryoEngine, a large-scale synthetic data generator that produces over 904k subtomograms from 452 particle classes for pretraining. Second, we design an Adaptive Phase Tokenization-enhanced Vision Transformer (APT-ViT), which incorporates adaptive phase tokenization as an equivariance-enhancing module that improves robustness to both geometric and semantic variations. Third, we introduce a Noise-Resilient Contrastive Learning (NRCL) strategy to stabilize representation learning under severe noise conditions. Evaluations across 24 synthetic and real datasets demonstrate state-of-the-art (SOTA) performance on all three major subtomogram tasks and strong generalization to unseen datasets, advancing scalable and robust subtomogram analysis in cryo-ET.

CRNov 22, 2021
Poisoning Attacks to Local Differential Privacy Protocols for Key-Value Data

Yongji Wu, Xiaoyu Cao, Jinyuan Jia et al.

Local Differential Privacy (LDP) protocols enable an untrusted server to perform privacy-preserving, federated data analytics. Various LDP protocols have been developed for different types of data such as categorical data, numerical data, and key-value data. Due to their distributed settings, LDP protocols are fundamentally vulnerable to poisoning attacks, in which fake users manipulate the server's analytics results via sending carefully crafted data to the server. However, existing poisoning attacks focused on LDP protocols for simple data types such as categorical and numerical data, leaving the security of LDP protocols for more advanced data types such as key-value data unexplored. In this work, we aim to bridge the gap by introducing novel poisoning attacks to LDP protocols for key-value data. In such a LDP protocol, a server aims to simultaneously estimate the frequency and mean value of each key among some users, each of whom possesses a set of key-value pairs. Our poisoning attacks aim to simultaneously maximize the frequencies and mean values of some attacker-chosen target keys via sending carefully crafted data from some fake users to the sever. Specifically, since our attacks have two objectives, we formulate them as a two-objective optimization problem. Moreover, we propose a method to approximately solve the two-objective optimization problem, from which we obtain the optimal crafted data the fake users should send to the server. We demonstrate the effectiveness of our attacks to three LDP protocols for key-value data both theoretically and empirically. We also explore two defenses against our attacks, which are effective in some scenarios but have limited effectiveness in other scenarios. Our results highlight the needs for new defenses against our poisoning attacks.

CVSep 13, 2021
FaceGuard: Proactive Deepfake Detection

Yuankun Yang, Chenyue Liang, Hongyu He et al.

Existing deepfake-detection methods focus on passive detection, i.e., they detect fake face images via exploiting the artifacts produced during deepfake manipulation. A key limitation of passive detection is that it cannot detect fake faces that are generated by new deepfake generation methods. In this work, we propose FaceGuard, a proactive deepfake-detection framework. FaceGuard embeds a watermark into a real face image before it is published on social media. Given a face image that claims to be an individual (e.g., Nicolas Cage), FaceGuard extracts a watermark from it and predicts the face image to be fake if the extracted watermark does not match well with the individual's ground truth one. A key component of FaceGuard is a new deep-learning-based watermarking method, which is 1) robust to normal image post-processing such as JPEG compression, Gaussian blurring, cropping, and resizing, but 2) fragile to deepfake manipulation. Our evaluation on multiple datasets shows that FaceGuard can detect deepfakes accurately and outperforms existing methods.

AIJul 6, 2021
IGrow: A Smart Agriculture Solution to Autonomous Greenhouse Control

Xiaoyan Cao, Yao Yao, Lanqing Li et al.

Agriculture is the foundation of human civilization. However, the rapid increase of the global population poses a challenge on this cornerstone by demanding more food. Modern autonomous greenhouses, equipped with sensors and actuators, provide a promising solution to the problem by empowering precise control for high-efficient food production. However, the optimal control of autonomous greenhouses is challenging, requiring decision-making based on high-dimensional sensory data, and the scaling of production is limited by the scarcity of labor capable of handling this task. With the advances of artificial intelligence (AI), the internet of things (IoT), and cloud computing technologies, we are hopeful to provide a solution to automate and smarten greenhouse control to address the above challenges. In this paper, we propose a smart agriculture solution named iGrow, for autonomous greenhouse control (AGC): (1) for the first time, we formulate the AGC problem as a Markov decision process (MDP) optimization problem; (2) we design a neural network-based simulator incorporated with the incremental mechanism to simulate the complete planting process of an autonomous greenhouse, which provides a testbed for the optimization of control strategies; (3) we propose a closed-loop bi-level optimization algorithm, which can dynamically re-optimize the greenhouse control strategy with newly observed data during real-world production. We not only conduct simulation experiments but also deploy iGrow in real scenarios, and experimental results demonstrate the effectiveness and superiority of iGrow in autonomous greenhouse simulation and optimal control. Particularly, compelling results from the tomato pilot project in real autonomous greenhouses show that our solution significantly increases crop yield (+10.15\%) and net profit (+92.70\%) with statistical significance compared to planting experts.

CRJul 5, 2021
Understanding the Security of Deepfake Detection

Xiaoyu Cao, Neil Zhenqiang Gong

Deepfakes pose growing challenges to the trust of information on the Internet. Thus, detecting deepfakes has attracted increasing attentions from both academia and industry. State-of-the-art deepfake detection methods consist of two key components, i.e., face extractor and face classifier, which extract the face region in an image and classify it to be real/fake, respectively. Existing studies mainly focused on improving the detection performance in non-adversarial settings, leaving security of deepfake detection in adversarial settings largely unexplored. In this work, we aim to bridge the gap. In particular, we perform a systematic measurement study to understand the security of the state-of-the-art deepfake detection methods in adversarial settings. We use two large-scale public deepfakes data sources including FaceForensics++ and Facebook Deepfake Detection Challenge, where the deepfakes are fake face images; and we train state-of-the-art deepfake detection methods. These detection methods can achieve 0.94--0.99 accuracies in non-adversarial settings on these datasets. However, our measurement results uncover multiple security limitations of the deepfake detection methods in adversarial settings. First, we find that an attacker can evade a face extractor, i.e., the face extractor fails to extract the correct face regions, via adding small Gaussian noise to its deepfake images. Second, we find that a face classifier trained using deepfakes generated by one method cannot detect deepfakes generated by another method, i.e., an attacker can evade detection via generating deepfakes using a new method. Third, we find that an attacker can leverage backdoor attacks developed by the adversarial machine learning community to evade a face classifier. Our results highlight that deepfake detection should consider the adversarial nature of the problem.

CRFeb 3, 2021
Provably Secure Federated Learning against Malicious Clients

Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

Federated learning enables clients to collaboratively learn a shared global model without sharing their local training data with a cloud server. However, malicious clients can corrupt the global model to predict incorrect labels for testing examples. Existing defenses against malicious clients leverage Byzantine-robust federated learning methods. However, these methods cannot provably guarantee that the predicted label for a testing example is not affected by malicious clients. We bridge this gap via ensemble federated learning. In particular, given any base federated learning algorithm, we use the algorithm to learn multiple global models, each of which is learnt using a randomly selected subset of clients. When predicting the label of a testing example, we take majority vote among the global models. We show that our ensemble federated learning with any base federated learning algorithm is provably secure against malicious clients. Specifically, the label predicted by our ensemble global model for a testing example is provably not affected by a bounded number of malicious clients. Moreover, we show that our derived bound is tight. We evaluate our method on MNIST and Human Activity Recognition datasets. For instance, our method can achieve a certified accuracy of 88% on MNIST when 20 out of 1,000 clients are malicious.

CRDec 27, 2020
FLTrust: Byzantine-robust Federated Learning via Trust Bootstrapping

Xiaoyu Cao, Minghong Fang, Jia Liu et al.

Byzantine-robust federated learning aims to enable a service provider to learn an accurate global model when a bounded number of clients are malicious. The key idea of existing Byzantine-robust federated learning methods is that the service provider performs statistical analysis among the clients' local model updates and removes suspicious ones, before aggregating them to update the global model. However, malicious clients can still corrupt the global models in these methods via sending carefully crafted local model updates to the service provider. The fundamental reason is that there is no root of trust in existing federated learning methods. In this work, we bridge the gap via proposing FLTrust, a new federated learning method in which the service provider itself bootstraps trust. In particular, the service provider itself collects a clean small training dataset (called root dataset) for the learning task and the service provider maintains a model (called server model) based on it to bootstrap trust. In each iteration, the service provider first assigns a trust score to each local model update from the clients, where a local model update has a lower trust score if its direction deviates more from the direction of the server model update. Then, the service provider normalizes the magnitudes of the local model updates such that they lie in the same hyper-sphere as the server model update in the vector space. Our normalization limits the impact of malicious local model updates with large magnitudes. Finally, the service provider computes the average of the normalized local model updates weighted by their trust scores as a global model update, which is used to update the global model. Our extensive evaluations on six datasets from different domains show that our FLTrust is secure against both existing attacks and strong adaptive attacks.

CRDec 7, 2020
Certified Robustness of Nearest Neighbors against Data Poisoning and Backdoor Attacks

Jinyuan Jia, Yupei Liu, Xiaoyu Cao et al.

Data poisoning attacks and backdoor attacks aim to corrupt a machine learning classifier via modifying, adding, and/or removing some carefully selected training examples, such that the corrupted classifier makes incorrect predictions as the attacker desires. The key idea of state-of-the-art certified defenses against data poisoning attacks and backdoor attacks is to create a majority vote mechanism to predict the label of a testing example. Moreover, each voter is a base classifier trained on a subset of the training dataset. Classical simple learning algorithms such as k nearest neighbors (kNN) and radius nearest neighbors (rNN) have intrinsic majority vote mechanisms. In this work, we show that the intrinsic majority vote mechanisms in kNN and rNN already provide certified robustness guarantees against data poisoning attacks and backdoor attacks. Moreover, our evaluation results on MNIST and CIFAR10 show that the intrinsic certified robustness guarantees of kNN and rNN outperform those provided by state-of-the-art certified defenses. Our results serve as standard baselines for future certified defenses against data poisoning attacks and backdoor attacks.

CRNov 15, 2020
Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations

Jinyuan Jia, Binghui Wang, Xiaoyu Cao et al.

Top-k predictions are used in many real-world applications such as machine learning as a service, recommender systems, and web searches. $\ell_0$-norm adversarial perturbation characterizes an attack that arbitrarily modifies some features of an input such that a classifier makes an incorrect prediction for the perturbed input. $\ell_0$-norm adversarial perturbation is easy to interpret and can be implemented in the physical world. Therefore, certifying robustness of top-$k$ predictions against $\ell_0$-norm adversarial perturbation is important. However, existing studies either focused on certifying $\ell_0$-norm robustness of top-$1$ predictions or $\ell_2$-norm robustness of top-$k$ predictions. In this work, we aim to bridge the gap. Our approach is based on randomized smoothing, which builds a provably robust classifier from an arbitrary classifier via randomizing an input. Our major theoretical contribution is an almost tight $\ell_0$-norm certified robustness guarantee for top-$k$ predictions. We empirically evaluate our method on CIFAR10 and ImageNet. For instance, our method can build a classifier that achieves a certified top-3 accuracy of 69.2\% on ImageNet when an attacker can arbitrarily perturb 5 pixels of a testing image.

CRAug 24, 2020
Certified Robustness of Graph Neural Networks against Adversarial Structural Perturbation

Binghui Wang, Jinyuan Jia, Xiaoyu Cao et al.

Graph neural networks (GNNs) have recently gained much attention for node and graph classification tasks on graph-structured data. However, multiple recent works showed that an attacker can easily make GNNs predict incorrectly via perturbing the graph structure, i.e., adding or deleting edges in the graph. We aim to defend against such attacks via developing certifiably robust GNNs. Specifically, we prove the certified robustness guarantee of any GNN for both node and graph classifications against structural perturbation. Moreover, we show that our certified robustness guarantee is tight. Our results are based on a recently proposed technique called randomized smoothing, which we extend to graph data. We also empirically evaluate our method for both node and graph classifications on multiple GNNs and multiple benchmark datasets. For instance, on the Cora dataset, Graph Convolutional Network with our randomized smoothing can achieve a certified accuracy of 0.49 when the attacker can arbitrarily add/delete at most 15 edges in the graph.

CRFeb 26, 2020
On Certifying Robustness against Backdoor Attacks via Randomized Smoothing

Binghui Wang, Xiaoyu Cao, Jinyuan jia et al.

Backdoor attack is a severe security threat to deep neural networks (DNNs). We envision that, like adversarial examples, there will be a cat-and-mouse game for backdoor attacks, i.e., new empirical defenses are developed to defend against backdoor attacks but they are soon broken by strong adaptive backdoor attacks. To prevent such cat-and-mouse game, we take the first step towards certified defenses against backdoor attacks. Specifically, in this work, we study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing. Randomized smoothing was originally developed to certify robustness against adversarial examples. We generalize randomized smoothing to defend against backdoor attacks. Our results show the theoretical feasibility of using randomized smoothing to certify robustness against backdoor attacks. However, we also find that existing randomized smoothing methods have limited effectiveness at defending against backdoor attacks, which highlight the needs of new theory and methods to certify robustness against backdoor attacks.

CRFeb 9, 2020
Certified Robustness of Community Detection against Adversarial Structural Perturbation via Randomized Smoothing

Jinyuan Jia, Binghui Wang, Xiaoyu Cao et al.

Community detection plays a key role in understanding graph structure. However, several recent studies showed that community detection is vulnerable to adversarial structural perturbation. In particular, via adding or removing a small number of carefully selected edges in a graph, an attacker can manipulate the detected communities. However, to the best of our knowledge, there are no studies on certifying robustness of community detection against such adversarial structural perturbation. In this work, we aim to bridge this gap. Specifically, we develop the first certified robustness guarantee of community detection against adversarial structural perturbation. Given an arbitrary community detection method, we build a new smoothed community detection method via randomly perturbing the graph structure. We theoretically show that the smoothed community detection method provably groups a given arbitrary set of nodes into the same community (or different communities) when the number of edges added/removed by an attacker is bounded. Moreover, we show that our certified robustness is tight. We also empirically evaluate our method on multiple real-world graphs with ground truth communities.

CRNov 26, 2019
Local Model Poisoning Attacks to Byzantine-Robust Federated Learning

Minghong Fang, Xiaoyu Cao, Jinyuan Jia et al.

In federated learning, multiple client devices jointly learn a machine learning model: each client device maintains a local model for its local training dataset, while a master device maintains a global model via aggregating the local models from the client devices. The machine learning community recently proposed several federated learning methods that were claimed to be robust against Byzantine failures (e.g., system failures, adversarial manipulations) of certain client devices. In this work, we perform the first systematic study on local model poisoning attacks to federated learning. We assume an attacker has compromised some client devices, and the attacker manipulates the local model parameters on the compromised client devices during the learning process such that the global model has a large testing error rate. We formulate our attacks as optimization problems and apply our attacks to four recent Byzantine-robust federated learning methods. Our empirical results on four real-world datasets show that our attacks can substantially increase the error rates of the models learnt by the federated learning methods that were claimed to be robust against Byzantine failures of some client devices. We generalize two defenses for data poisoning attacks to defend against our local model poisoning attacks. Our evaluation results show that one defense can effectively defend against our attacks in some cases, but the defenses are not effective enough in other cases, highlighting the need for new defenses against our local model poisoning attacks to federated learning.

CRNov 5, 2019
Data Poisoning Attacks to Local Differential Privacy Protocols

Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

Local Differential Privacy (LDP) protocols enable an untrusted data collector to perform privacy-preserving data analytics. In particular, each user locally perturbs its data to preserve privacy before sending it to the data collector, who aggregates the perturbed data to obtain statistics of interest. In the past several years, researchers from multiple communities -- such as security, database, and theoretical computer science -- have proposed many LDP protocols. These studies mainly focused on improving the utility of the LDP protocols. However, the security of LDP protocols is largely unexplored. In this work, we aim to bridge this gap. We focus on LDP protocols for frequency estimation and heavy hitter identification, which are two basic data analytics tasks. Specifically, we show that an attacker can inject fake users into an LDP protocol and the fake users send carefully crafted data to the data collector such that the LDP protocol estimates high frequencies for arbitrary attacker-chosen items or identifies them as heavy hitters. We call our attacks data poisoning attacks. We theoretically and/or empirically show the effectiveness of our attacks. We also explore three countermeasures against our attacks. Our experimental results show that they can effectively defend against our attacks in some scenarios but have limited effectiveness in others, highlighting the needs for new defenses against our attacks.

CROct 28, 2019
IPGuard: Protecting Intellectual Property of Deep Neural Networks via Fingerprinting the Classification Boundary

Xiaoyu Cao, Jinyuan Jia, Neil Zhenqiang Gong

A deep neural network (DNN) classifier represents a model owner's intellectual property as training a DNN classifier often requires lots of resource. Watermarking was recently proposed to protect the intellectual property of DNN classifiers. However, watermarking suffers from a key limitation: it sacrifices the utility/accuracy of the model owner's classifier because it tampers the classifier's training or fine-tuning process. In this work, we propose IPGuard, the first method to protect intellectual property of DNN classifiers that provably incurs no accuracy loss for the classifiers. Our key observation is that a DNN classifier can be uniquely represented by its classification boundary. Based on this observation, IPGuard extracts some data points near the classification boundary of the model owner's classifier and uses them to fingerprint the classifier. A DNN classifier is said to be a pirated version of the model owner's classifier if they predict the same labels for most fingerprinting data points. IPGuard is qualitatively different from watermarking. Specifically, IPGuard extracts fingerprinting data points near the classification boundary of a classifier that is already trained, while watermarking embeds watermarks into a classifier during its training or fine-tuning process. We extensively evaluate IPGuard on CIFAR-10, CIFAR-100, and ImageNet datasets. Our results show that IPGuard can robustly identify post-processed versions of the model owner's classifier as pirated versions of the classifier, and IPGuard can identify classifiers, which are not the model owner's classifier nor its post-processed versions, as non-pirated versions of the classifier.

CRSep 17, 2017
Mitigating Evasion Attacks to Deep Neural Networks via Region-based Classification

Xiaoyu Cao, Neil Zhenqiang Gong

Deep neural networks (DNNs) have transformed several artificial intelligence research areas including computer vision, speech recognition, and natural language processing. However, recent studies demonstrated that DNNs are vulnerable to adversarial manipulations at testing time. Specifically, suppose we have a testing example, whose label can be correctly predicted by a DNN classifier. An attacker can add a small carefully crafted noise to the testing example such that the DNN classifier predicts an incorrect label, where the crafted testing example is called adversarial example. Such attacks are called evasion attacks. Evasion attacks are one of the biggest challenges for deploying DNNs in safety and security critical applications such as self-driving cars. In this work, we develop new methods to defend against evasion attacks. Our key observation is that adversarial examples are close to the classification boundary. Therefore, we propose region-based classification to be robust to adversarial examples. For a benign/adversarial testing example, we ensemble information in a hypercube centered at the example to predict its label. In contrast, traditional classifiers are point-based classification, i.e., given a testing example, the classifier predicts its label based on the testing example alone. Our evaluation results on MNIST and CIFAR-10 datasets demonstrate that our region-based classification can significantly mitigate evasion attacks without sacrificing classification accuracy on benign examples. Specifically, our region-based classification achieves the same classification accuracy on testing benign examples as point-based classification, but our region-based classification is significantly more robust than point-based classification to various evasion attacks.

CRApr 11, 2017
PIANO: Proximity-based User Authentication on Voice-Powered Internet-of-Things Devices

Neil Zhenqiang Gong, Altay Ozen, Yu Wu et al.

Voice is envisioned to be a popular way for humans to interact with Internet-of-Things (IoT) devices. We propose a proximity-based user authentication method (called PIANO) for access control on such voice-powered IoT devices. PIANO leverages the built-in speaker, microphone, and Bluetooth that voice-powered IoT devices often already have. Specifically, we assume that a user carries a personal voice-powered device (e.g., smartphone, smartwatch, or smartglass), which serves as the user's identity. When another voice-powered IoT device of the user requires authentication, PIANO estimates the distance between the two devices by playing and detecting certain acoustic signals; PIANO grants access if the estimated distance is no larger than a user-selected threshold. We implemented a proof-of-concept prototype of PIANO. Through theoretical and empirical evaluations, we find that PIANO is secure, reliable, personalizable, and efficient.