Sencun Zhu

CR
h-index 57
16 papers
723 citations
Novelty 55%
AI Score 34

16 Papers

LG Mar 6, 2023
Learning to Backdoor Federated Learning

Henger Li, Chen Wu, Sencun Zhu et al.

In a federated learning (FL) system, malicious participants can easily embed backdoors into the aggregated model while maintaining the model's performance on the main task. To this end, various defenses, including training stage aggregation-based defenses and post-training mitigation defenses, have been proposed recently. While these defenses obtain reasonable performance against existing backdoor attacks, which are mainly heuristics based, we show that they are insufficient in the face of more advanced attacks. In particular, we propose a general reinforcement learning-based backdoor attack framework where the attacker first trains a (non-myopic) attack policy using a simulator built upon its local data and common knowledge on the FL system, which is then applied during actual FL training. Our attack framework is both adaptive and flexible and achieves strong attack performance and durability even under state-of-the-art defenses.

CR Apr 30, 2025
How to Backdoor the Knowledge Distillation

Chen Wu, Qian Ma, Prasenjit Mitra et al.

Knowledge distillation has become a cornerstone in modern machine learning systems, celebrated for its ability to transfer knowledge from a large, complex teacher model to a more efficient student model. Traditionally, this process is regarded as secure, assuming the teacher model is clean. This belief stems from conventional backdoor attacks relying on poisoned training data with backdoor triggers and attacker-chosen labels, which are not involved in the distillation process. Instead, knowledge distillation uses the outputs of a clean teacher model to guide the student model, inherently preventing recognition or response to backdoor triggers as intended by an attacker. In this paper, we challenge this assumption by introducing a novel attack methodology that strategically poisons the distillation dataset with adversarial examples embedded with backdoor triggers. This technique allows for the stealthy compromise of the student model while maintaining the integrity of the teacher model. Our innovative approach represents the first successful exploitation of vulnerabilities within the knowledge distillation process using clean teacher models. Through extensive experiments conducted across various datasets and attack settings, we demonstrate the robustness, stealthiness, and effectiveness of our method. Our findings reveal previously unrecognized vulnerabilities and pave the way for future research aimed at securing knowledge distillation processes against backdoor attacks.

LG Apr 1, 2025
Prompting Forgetting: Unlearning in GANs via Textual Guidance

Piyush Nagasubramaniam, Neeraj Karamchandani, Chen Wu et al.

State-of-the-art generative models exhibit powerful image-generation capabilities, introducing various ethical and legal challenges to service providers hosting these models. Consequently, Content Removal Techniques (CRTs) have emerged as a growing area of research to control outputs without full-scale retraining. Recent work has explored the use of Machine Unlearning in generative models to address content removal. However, the focus of such research has been on diffusion models, and unlearning in Generative Adversarial Networks (GANs) has remained largely unexplored. We address this gap by proposing Text-to-Unlearn, a novel framework that selectively unlearns concepts from pre-trained GANs using only text prompts, enabling feature unlearning, identity unlearning, and fine-grained tasks like expression and multi-attribute removal in models trained on human faces. Leveraging natural language descriptions, our approach guides the unlearning process without requiring additional datasets or supervised fine-tuning, offering a scalable and efficient solution. To evaluate its effectiveness, we introduce an automatic unlearning assessment method adapted from state-of-the-art image-text alignment metrics, providing a comprehensive analysis of the unlearning methodology. To our knowledge, Text-to-Unlearn is the first cross-modal unlearning framework for GANs, representing a flexible and efficient advancement in managing generative model behavior.

CR May 10, 2023
HoneyIoT: Adaptive High-Interaction Honeypot for IoT Devices Through Reinforcement Learning

Chongqi Guan, Heting Liu, Guohong Cao et al.

As IoT devices are becoming widely deployed, there exist many threats to IoT-based systems due to their inherent vulnerabilities. One effective approach to improving IoT security is to deploy IoT honeypot systems, which can collect attack information and reveal the methods and strategies used by attackers. However, building high-interaction IoT honeypots is challenging due to the heterogeneity of IoT devices. Vulnerabilities in IoT devices typically depend on specific device types or firmware versions, which encourages attackers to perform pre-attack checks to gather device information before launching attacks. Moreover, conventional honeypots are easily detected because their replying logic differs from that of the IoT devices they try to mimic. To address these problems, we develop an adaptive high-interaction honeypot for IoT devices, called HoneyIoT. We first build a real-device-based attack trace collection system to learn how attackers interact with IoT devices. We then model the attack behavior as a Markov decision process and leverage reinforcement learning techniques to learn the best responses to engage attackers based on the attack trace. We also use differential analysis techniques to mutate response values in some fields to generate high-fidelity responses. HoneyIoT has been deployed on the public Internet. Experimental results show that HoneyIoT can effectively bypass the pre-attack checks and mislead the attackers into uploading malware. Furthermore, HoneyIoT is covert against widely used reconnaissance and honeypot detection tools.

LG Jan 24, 2022
Federated Unlearning with Knowledge Distillation

Chen Wu, Sencun Zhu, Prasenjit Mitra

Federated Learning (FL) is designed to protect the data privacy of each client during the training process by transmitting only models instead of the original data. However, the trained model may memorize certain information about the training data. With the recent legislation on the right to be forgotten, it is essential for the FL model to possess the ability to forget what it has learned from each client. We propose a novel federated unlearning method to eliminate a client's contribution by subtracting the accumulated historical updates from the model and leveraging the knowledge distillation method to restore the model's performance without using any data from the clients. This method does not have any restrictions on the type of neural networks and does not rely on clients' participation, so it is practical and efficient in the FL system. We further introduce backdoor attacks in the training process to help evaluate the unlearning effect. Experiments on three canonical datasets demonstrate the effectiveness and efficiency of our method.
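
The subtraction step described in this abstract can be illustrated with a minimal sketch. It assumes a FedAvg-style server that has stored every client's update for every round and that the global model is the initial model plus the mean update of each round. The function names and the toy data are hypothetical, and the knowledge distillation step that restores accuracy is not shown.

```python
from typing import Dict, List

Vector = List[float]


def aggregate(initial: Vector, rounds: List[Dict[str, Vector]]) -> Vector:
    """FedAvg-style aggregation (assumed): add each round's mean client update."""
    model = list(initial)
    for updates in rounds:
        n = len(updates)
        for update in updates.values():
            for i, delta in enumerate(update):
                model[i] += delta / n
    return model


def subtract_client(model: Vector, rounds: List[Dict[str, Vector]], client: str) -> Vector:
    """Remove the target client's accumulated, averaged updates from the model."""
    result = list(model)
    for updates in rounds:
        if client in updates:
            n = len(updates)
            for i, delta in enumerate(updates[client]):
                result[i] -= delta / n
    return result


# Toy history: two rounds, two clients, a two-parameter model.
history = [
    {"a": [1.0, 0.0], "b": [0.0, 2.0]},
    {"a": [1.0, 1.0], "b": [0.0, 0.0]},
]
global_model = aggregate([0.0, 0.0], history)
unlearned = subtract_client(global_model, history, "a")
```

In this toy setting the result keeps only client "b"'s averaged contribution. In practice the subtracted model would typically lose accuracy, which is the gap the abstract says distillation is meant to close.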

IR Sep 8, 2021
AppQ: Warm-starting App Recommendation Based on View Graphs

Dan Su, Jiqiang Liu, Sencun Zhu et al.

Current app ranking and recommendation systems are mainly based on user-generated information, e.g., number of downloads and ratings. However, new apps often have little (or even no) user feedback, suffering from the classic cold-start problem. How to quickly identify and then recommend new apps of high quality is a challenging issue. Here, a fundamental requirement is the capability to accurately measure an app's quality based on its inborn features, rather than user-generated features. Since users obtain first-hand experience of an app by interacting with its views, we speculate that the inborn features are largely related to the visual quality of individual views in an app and the ways the views switch to one another. In this work, we propose AppQ, a novel app quality grading and recommendation system that extracts inborn features of apps based on app source code. In particular, AppQ works in parallel to perform code analysis to extract app-level features as well as dynamic analysis to capture view-level layout hierarchy and the switching among views. Each app is then expressed as an attributed view graph, which is converted into a vector and fed to classifiers for recognizing its quality classes. Our evaluation with an app dataset from Google Play reports that AppQ achieves the best performance with an accuracy of 85.0%. This shows a lot of promise to warm-start app grading and recommendation systems with AppQ.

CR Sep 1, 2021
Let Your Camera See for You: A Novel Two-Factor Authentication Method against Real-Time Phishing Attacks

Yuanyi Sun, Sencun Zhu, Yao Zhao et al.

Today, two-factor authentication (2FA) is a widely implemented mechanism to counter phishing attacks. Although much effort has been invested in 2FA, most 2FA systems are still vulnerable to carefully designed phishing attacks, and some even require special hardware, which limits their wide deployment. Recently, real-time phishing (RTP) has made the situation even worse because an adversary can effortlessly establish a phishing website replicating a target website without any background in web page design. Traditional 2FA can be easily bypassed by such RTP attacks. In this work, we propose PhotoAuth, a novel 2FA system to counter RTP attacks. The main idea is to request a user to take a photo of the web browser with the domain name in the address bar as the 2nd authentication factor. The web server side extracts the domain name information based on Optical Character Recognition (OCR), and then determines if the user is visiting this website or a fake one, thus defeating the RTP attacks where an adversary must set up a fake website with a different domain. We prototyped our system and evaluated its performance in various environments. The results showed that PhotoAuth is an effective technique with good scalability. We also showed that compared to other 2FA systems, PhotoAuth has several advantages; in particular, no special hardware or software support is needed on the client side except a phone, making it readily deployable.
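
The server-side decision described in this abstract (compare the domain read from the photo against the real site) can be sketched as follows. This is only an illustration under stated assumptions: the OCR step is not shown, the input strings stand in for hypothetical OCR output of the address bar, and `extract_host` and `is_expected_site` are names invented here, not part of the authors' system.

```python
from urllib.parse import urlsplit


def extract_host(address_bar_text: str) -> str:
    """Parse the host name out of text assumed to come from OCR of the address bar."""
    text = address_bar_text.strip().lower()
    if "://" not in text:
        # Browsers often hide the scheme; add one so urlsplit finds the host.
        text = "https://" + text
    return urlsplit(text).hostname or ""


def is_expected_site(address_bar_text: str, expected_domain: str) -> bool:
    """Accept the expected domain itself or any of its subdomains, nothing else."""
    host = extract_host(address_bar_text)
    expected = expected_domain.lower()
    return host == expected or host.endswith("." + expected)
```

Matching on the full host name, rather than searching for the expected domain as a substring, is a deliberate choice in this sketch: a phishing host such as `example.com.evil.net` contains the legitimate name but is rejected.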

LG Dec 27, 2020
Recomposition vs. Prediction: A Novel Anomaly Detection for Discrete Events Based On Autoencoder

Lun-Pin Yuan, Peng Liu, Sencun Zhu

One of the most challenging problems in the field of intrusion detection is anomaly detection for discrete event logs. While most earlier work focused on applying unsupervised learning upon engineered features, most recent work has started to resolve this challenge by applying deep learning methodology to abstraction of discrete event entries. Inspired by natural language processing, LSTM-based anomaly detection models were proposed. They try to predict upcoming events, and raise an anomaly alert when a prediction fails to meet a certain criterion. However, such a predict-next-event methodology has a fundamental limitation: event predictions may not be able to fully exploit the distinctive characteristics of sequences. This limitation leads to high false positives (FPs) and high false negatives (FNs). It is also critical to examine the structure of sequences and the bi-directional causality among individual events. To this end, we propose a new methodology: Recomposing event sequences as anomaly detection. We propose DabLog, a Deep Autoencoder-Based anomaly detection method for discrete event Logs. The fundamental difference is that, rather than predicting upcoming events, our approach determines whether a sequence is normal or abnormal by analyzing (encoding) and reconstructing (decoding) the given sequence. Our evaluation results show that our new methodology can significantly reduce the numbers of FPs and FNs, hence achieving a higher F1 score.

LG Dec 27, 2020
Time-Window Group-Correlation Support vs. Individual Features: A Detection of Abnormal Users

Lun-Pin Yuan, Euijin Choo, Ting Yu et al.

Autoencoder-based anomaly detection methods have been used in identifying anomalous users from large-scale enterprise logs with the assumption that adversarial activities do not follow past habitual patterns. Most existing approaches typically build models by reconstructing single-day and individual-user behaviors. However, without capturing long-term signals and group-correlation signals, the models cannot identify low-signal yet long-lasting threats, and will wrongly report many normal users as anomalies on busy days, which, in turn, leads to a high false positive rate. In this paper, we propose ACOBE, an Anomaly detection method based on COmpound BEhavior, which takes into consideration long-term patterns and group behaviors. ACOBE leverages a novel behavior representation and an ensemble of deep autoencoders and produces an ordered investigation list. Our evaluation shows that ACOBE outperforms prior work by a large margin in terms of precision and recall, and our case study demonstrates that ACOBE is applicable in practice for cyberattack detection.

CR Oct 28, 2020
Mitigating Backdoor Attacks in Federated Learning

Chen Wu, Xian Yang, Sencun Zhu et al.

Malicious clients can attack federated learning systems using malicious data, including backdoor samples, during the training phase. The compromised global model will perform well on the validation dataset designed for the task, but a small subset of data with backdoor patterns may trigger the model to make a wrong prediction. There has been an arms race between attackers who tried to conceal attacks and defenders who tried to detect attacks during the aggregation stage of training on the server side. In this work, we propose a new and effective method to mitigate backdoor attacks after the training phase. Specifically, we design a federated pruning method to remove redundant neurons in the network and then adjust the model's extreme weight values. Our experiments conducted on distributed Fashion-MNIST show that our method can reduce the average attack success rate from 99.7% to 1.9% with a 5.5% loss of test accuracy on the validation dataset. To minimize the pruning influence on test accuracy, we can fine-tune after pruning, and the attack success rate drops to 6.4%, with only a 1.7% loss of test accuracy. Further experiments under Distributed Backdoor Attacks on CIFAR-10 also show promising results: the average attack success rate drops by more than 70% with less than 2% loss of test accuracy on the validation dataset.
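
The two post-training steps named in this abstract (prune redundant neurons, then adjust extreme weights) can be sketched roughly as below. The details are assumptions made for illustration, not the paper's exact procedure: clients are assumed to report per-neuron average activations on their clean data, the least active neurons are treated as redundant, and "extreme" is taken to mean further than `k` standard deviations from the mean weight.

```python
from statistics import mean, pstdev
from typing import List


def neurons_to_prune(client_activations: List[List[float]], prune_count: int) -> List[int]:
    """Average the per-neuron activations reported by clients; pick the least active."""
    neuron_count = len(client_activations[0])
    average = [mean(report[i] for report in client_activations) for i in range(neuron_count)]
    order = sorted(range(neuron_count), key=lambda i: (average[i], i))
    return sorted(order[:prune_count])


def clip_extreme_weights(weights: List[float], k: float) -> List[float]:
    """Clamp weights to [mean - k * std, mean + k * std] (threshold rule assumed)."""
    mu, sigma = mean(weights), pstdev(weights)
    low, high = mu - k * sigma, mu + k * sigma
    return [min(max(w, low), high) for w in weights]


# Hypothetical reports from two clients for a four-neuron layer.
reports = [[0.9, 0.0, 0.5, 0.1], [0.8, 0.1, 0.4, 0.0]]
pruned = neurons_to_prune(reports, prune_count=2)
clipped = clip_extreme_weights([0.0, 0.0, 0.0, 0.0, 10.0], k=1.0)
```

The intuition behind this kind of defense is that neurons dormant on clean data are candidates for carrying backdoor behavior, and unusually large weights can amplify a trigger. How many neurons to prune and how tightly to clip are tuning choices this sketch leaves open.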

CR Oct 21, 2020
"Are you home alone?" "Yes" Disclosing Security and Privacy Vulnerabilities in Alexa Skills

Dan Su, Jiqiang Liu, Sencun Zhu et al.

Home voice assistants such as Amazon Alexa have become increasingly popular due to many interesting voice-activated services provided through special applications called skills. These skills, though useful, have also introduced new security and privacy challenges. Prior work has verified that Alexa is vulnerable to multiple types of voice attacks, but the security and privacy risk of using skills has not been fully investigated. In this work, we study an adversary model that covers three severe privacy-related vulnerabilities, namely, over-privileged resource access, hidden code-manipulation and hidden content-manipulation. By exploiting these vulnerabilities, malicious skills can not only bypass the security tests in the vetting process, but also surreptitiously change their original functions in an attempt to steal users' personal information. What makes the situation even worse is that the attacks can be extended from virtual networks to the physical world. We systematically study the security issues from the feasibility and implementation of the attacks to the design of countermeasures. We also made a comprehensive survey study of 33,744 skills in the Alexa Skills Store.

CR Aug 28, 2020
Toward A Network-Assisted Approach for Effective Ransomware Detection

Tianrou Xia, Yuanyi Sun, Sencun Zhu et al.

Ransomware is a kind of malware using cryptographic mechanisms to prevent victims from normal use of their computers. As a result, victims lose access to their files and desktops unless they pay the ransom to the attackers. By the end of 2019, ransomware attacks had caused more than 10 billion dollars of financial loss to enterprises and individuals. In this work, we propose a Network-Assisted Approach (NAA), which contains effective local detection and network-level detection mechanisms, to help users determine whether a machine has been infected by ransomware. To evaluate its performance, we built 100 containers in Docker to simulate network scenarios. A hybrid ransomware sample which is close to real-world ransomware is deployed on simulated infected machines. The experiment results show that our network-level detection mechanisms are separately applicable to WAN and LAN environments for ransomware detection.

CR Aug 26, 2019
No Peeking through My Windows: Conserving Privacy in Personal Drones

Alem Fitwi, Yu Chen, Sencun Zhu

Drone technology has been increasingly used by many tech-savvy consumers, a number of defense companies, hobbyists and enthusiasts during the last ten years. Drones often come in various sizes and are designed for a multitude of purposes. Nowadays many people have small-sized personal drones for entertainment, filming, or transporting items from one place to another. However, personal drones lack a privacy-preserving mechanism. While on a mission, drones often trespass into the personal territories of other people and capture photos or videos through windows without their knowledge and consent. They may also capture video or pictures of people walking, sitting, or doing private things within the drones' reach in clear form without their permission. This could potentially invade people's personal privacy. This paper, therefore, proposes a lightweight privacy-preserving-by-design method that prevents drones from peeking through windows of houses and capturing people doing private things at home. It is a fast window object detection and scrambling technology built based on image-enhancing, morphological transformation, segmentation and contouring processes (MASP). Besides, a chaotic scrambling technique is incorporated into it for privacy purposes. Hence, this mechanism detects window objects in every image or frame of a real-time video and masks them chaotically to protect the privacy of people. The experimental results validated that the proposed MASP method is lightweight and suitable to be employed in drones, considered as edge devices.
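
As an illustration of what chaotic scrambling of a detected window region can look like, here is a minimal sketch. The abstract does not say which chaotic system is used, so the logistic map, the parameter `r = 3.99`, and the seed-as-key convention are all assumptions, and the flat list of pixel values stands in for a real image region.

```python
from typing import List


def logistic_sequence(seed: float, length: int, r: float = 3.99) -> List[float]:
    """Iterate the logistic map x -> r * x * (1 - x), starting from the seed."""
    values = []
    x = seed
    for _ in range(length):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def permutation_from_key(seed: float, length: int) -> List[int]:
    """Rank the chaotic sequence to obtain a key-dependent permutation of indices."""
    chaos = logistic_sequence(seed, length)
    return sorted(range(length), key=lambda i: (chaos[i], i))


def scramble(pixels: List[int], seed: float) -> List[int]:
    """Reorder the pixels of a region according to the keyed permutation."""
    perm = permutation_from_key(seed, len(pixels))
    return [pixels[source] for source in perm]


def unscramble(scrambled: List[int], seed: float) -> List[int]:
    """Invert scramble() for a holder of the same seed."""
    perm = permutation_from_key(seed, len(scrambled))
    restored = [0] * len(scrambled)
    for position, source in enumerate(perm):
        restored[source] = scrambled[position]
    return restored


window_region = list(range(16))  # stand-in for a 4x4 window patch
masked = scramble(window_region, seed=0.3141)
recovered = unscramble(masked, seed=0.3141)
```

A permutation of this kind only moves pixel positions and leaves the value histogram unchanged, so it is a lightweight masking step rather than encryption. Its low cost is what makes this style of scrambling plausible on an edge device.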

CR Mar 8, 2019
A Study on Smart Online Frame Forging Attacks against Video Surveillance System

Deeraj Nagothu, Jacob Schwell, Yu Chen et al.

Video Surveillance Systems (VSS) have become an essential infrastructural element of smart cities by increasing public safety and countering criminal activities. A VSS is normally deployed in a secure network to prevent access from unauthorized personnel. Compared to traditional systems that continuously record video regardless of the actions in the frame, a smart VSS has the capability of capturing video data upon motion detection or object detection, and then extracting essential information and sending it to users. This increasing design complexity of the surveillance system, however, also introduces new security vulnerabilities. In this work, a smart, real-time frame duplication attack is investigated. We show the feasibility of forging the video streams in real-time as the camera's surroundings change. The generated frames are compared constantly and instantly to identify changes in the pixel values that could represent motion detection or changes in light intensities outdoors. An attacker (intruder) can remotely trigger the replay of some previously duplicated video streams manually or automatically, via a special quick response (QR) code or when the face of an intruder appears in the camera field of view. A detection technique is proposed by leveraging the real-time electrical network frequency (ENF) reference database to match with the power grid frequency.

CR Aug 30, 2018
Backdoor Embedding in Convolutional Neural Network Models via Invisible Perturbation

Cong Liao, Haoti Zhong, Anna Squicciarini et al.

Deep learning models have consistently outperformed traditional machine learning models in various classification tasks, including image classification. As such, they have become increasingly prevalent in many real world applications including those where security is of great concern. Such popularity, however, may attract attackers to exploit the vulnerabilities of the deployed deep learning models and launch attacks against security-sensitive applications. In this paper, we focus on a specific type of data poisoning attack, which we refer to as a backdoor injection attack. The main goal of the adversary performing such an attack is to generate and inject a backdoor into a deep learning model that can be triggered to recognize certain embedded patterns with a target label of the attacker's choice. Additionally, a backdoor injection attack should occur in a stealthy manner, without undermining the efficacy of the victim model. Specifically, we propose two approaches for generating a backdoor that is hardly perceptible yet effective in poisoning the model. We consider two attack settings, with backdoor injection carried out either before model training or during model updating. We carry out extensive experimental evaluations under various assumptions on the adversary model, and demonstrate that such attacks can be effective and achieve a high attack success rate (above 90%) at a small cost of model accuracy loss (below 1%) with a small injection rate (around 1%), even under the weakest assumption wherein the adversary has no knowledge either of the original training data or the classifier model.

CR Sep 19, 2017
Keeping Context In Mind: Automating Mobile App Access Control with User Interface Inspection

Hao Fu, Zizhan Zheng, Sencun Zhu et al.

Recent studies observe that app foreground is the most striking component that influences the access control decisions on mobile platforms, as users tend to deny permission requests lacking visible evidence. However, none of the existing permission models provides a systematic approach that can automatically answer the question: Is the resource access indicated by app foreground? In this work, we present the design, implementation, and evaluation of COSMOS, a context-aware mediation system that bridges the semantic gap between foreground interaction and background access, in order to protect system integrity and user privacy. Specifically, COSMOS learns from a large set of apps with similar functionalities and user interfaces to construct generic models that detect the outliers at runtime. It can be further customized to satisfy specific user privacy preferences by continuously evolving with user decisions. Experiments show that COSMOS achieves both high precision and high recall in detecting malicious requests. We also demonstrate the effectiveness of COSMOS in capturing specific user preferences using the decisions collected from 24 users and illustrate that COSMOS can be easily deployed on smartphones as a real-time guard with a very low performance overhead.