Dongxi Liu

13papers

495citations

Novelty58%

AI Score28

Ranked #153,415 of 201,326 authors (top 76%)#4,354 in CR (top 60%)

13 Papers

CRMay 31, 2022

CASSOCK: Viable Backdoor Attacks against DNN in The Wall of Source-Specific Backdoor Defences

Shang Wang, Yansong Gao, Anmin Fu et al. · nvidia, utoronto

As a critical threat to deep neural networks (DNNs), backdoor attacks can be categorized into two types, i.e., source-agnostic backdoor attacks (SABAs) and source-specific backdoor attacks (SSBAs). Compared to traditional SABAs, SSBAs are more advanced in that they have superior stealthier in bypassing mainstream countermeasures that are effective against SABAs. Nonetheless, existing SSBAs suffer from two major limitations. First, they can hardly achieve a good trade-off between ASR (attack success rate) and FPR (false positive rate). Besides, they can be effectively detected by the state-of-the-art (SOTA) countermeasures (e.g., SCAn). To address the limitations above, we propose a new class of viable source-specific backdoor attacks, coined as CASSOCK. Our key insight is that trigger designs when creating poisoned data and cover data in SSBAs play a crucial role in demonstrating a viable source-specific attack, which has not been considered by existing SSBAs. With this insight, we focus on trigger transparency and content when crafting triggers for poisoned dataset where a sample has an attacker-targeted label and cover dataset where a sample has a ground-truth label. Specifically, we implement $CASSOCK_{Trans}$ and $CASSOCK_{Cont}$. While both they are orthogonal, they are complementary to each other, generating a more powerful attack, called $CASSOCK_{Comp}$, with further improved attack performance and stealthiness. We perform a comprehensive evaluation of the three $CASSOCK$-based attacks on four popular datasets and three SOTA defenses. Compared with a representative SSBA as a baseline ($SSBA_{Base}$), $CASSOCK$-based attacks have significantly advanced the attack performance, i.e., higher ASR and lower FPR with comparable CDA (clean data accuracy). Besides, $CASSOCK$-based attacks have effectively bypassed the SOTA defenses, and $SSBA_{Base}$ cannot.

CVFeb 18, 2022

Resurrecting Trust in Facial Recognition: Mitigating Backdoor Attacks in Face Recognition to Prevent Potential Privacy Breaches

Reena Zelenkova, Jack Swallow, M. A. P. Chamikara et al.

Biometric data, such as face images, are often associated with sensitive information (e.g medical, financial, personal government records). Hence, a data breach in a system storing such information can have devastating consequences. Deep learning is widely utilized for face recognition (FR); however, such models are vulnerable to backdoor attacks executed by malicious parties. Backdoor attacks cause a model to misclassify a particular class as a target class during recognition. This vulnerability can allow adversaries to gain access to highly sensitive data protected by biometric authentication measures or allow the malicious party to masquerade as an individual with higher system permissions. Such breaches pose a serious privacy threat. Previous methods integrate noise addition mechanisms into face recognition models to mitigate this issue and improve the robustness of classification against backdoor attacks. However, this can drastically affect model accuracy. We propose a novel and generalizable approach (named BA-BAM: Biometric Authentication - Backdoor Attack Mitigation), that aims to prevent backdoor attacks on face authentication deep learning models through transfer learning and selective image perturbation. The empirical evidence shows that BA-BAM is highly robust and incurs a maximal accuracy drop of 2.4%, while reducing the attack success rate to a maximum of 20%. Comparisons with existing approaches show that BA-BAM provides a more practical backdoor mitigation approach for face recognition.

CRFeb 12, 2022

Local Differential Privacy for Federated Learning

M. A. P. Chamikara, Dongxi Liu, Seyit Camtepe et al.

Advanced adversarial attacks such as membership inference and model memorization can make federated learning (FL) vulnerable and potentially leak sensitive private data. Local differentially private (LDP) approaches are gaining more popularity due to stronger privacy notions and native support for data distribution compared to other differentially private (DP) solutions. However, DP approaches assume that the FL server (that aggregates the models) is honest (run the FL protocol honestly) or semi-honest (run the FL protocol honestly while also trying to learn as much information as possible). These assumptions make such approaches unrealistic and unreliable for real-world settings. Besides, in real-world industrial environments (e.g., healthcare), the distributed entities (e.g., hospitals) are already composed of locally running machine learning models (this setting is also referred to as the cross-silo setting). Existing approaches do not provide a scalable mechanism for privacy-preserving FL to be utilized under such settings, potentially with untrusted parties. This paper proposes a new local differentially private FL (named LDPFL) protocol for industrial settings. LDPFL can run in industrial settings with untrusted entities while enforcing stronger privacy guarantees than existing approaches. LDPFL shows high FL model performance (up to 98%) under small privacy budgets (e.g., epsilon = 0.5) in comparison to existing methods.

CRMay 12, 2021

Snipuzz: Black-box Fuzzing of IoT Firmware via Message Snippet Inference

Xiaotao Feng, Ruoxi Sun, Xiaogang Zhu et al.

The proliferation of Internet of Things (IoT) devices has made people's lives more convenient, but it has also raised many security concerns. Due to the difficulty of obtaining and emulating IoT firmware, the black-box fuzzing of IoT devices has become a viable option. However, existing black-box fuzzers cannot form effective mutation optimization mechanisms to guide their testing processes, mainly due to the lack of feedback. It is difficult or even impossible to apply existing grammar-based fuzzing strategies. Therefore, an efficient fuzzing approach with syntax inference is required in the IoT fuzzing domain. To address these critical problems, we propose a novel automatic black-box fuzzing for IoT firmware, termed Snipuzz. Snipuzz runs as a client communicating with the devices and infers message snippets for mutation based on the responses. Each snippet refers to a block of consecutive bytes that reflect the approximate code coverage in fuzzing. This mutation strategy based on message snippets considerably narrows down the search space to change the probing messages. We compared Snipuzz with four state-of-the-art IoT fuzzing approaches, i.e., IoTFuzzer, BooFuzz, Doona, and Nemesys. Snipuzz not only inherits the advantages of app-based fuzzing (e.g., IoTFuzzer, but also utilizes communication responses to perform efficient mutation. Furthermore, Snipuzz is lightweight as its execution does not rely on any prerequisite operations, such as reverse engineering of apps. We also evaluated Snipuzz on 20 popular real-world IoT devices. Our results show that Snipuzz could identify 5 zero-day vulnerabilities, and 3 of them could be exposed only by Snipuzz. All the newly discovered vulnerabilities have been confirmed by their vendors.

CRJul 17, 2020

PThammer: Cross-User-Kernel-Boundary Rowhammer through Implicit Accesses

Zhi Zhang, Yueqiang Cheng, Dongxi Liu et al.

Rowhammer is a hardware vulnerability in DRAM memory, where repeated access to memory can induce bit flips in neighboring memory locations. Being a hardware vulnerability, rowhammer bypasses all of the system memory protection, allowing adversaries to compromise the integrity and confidentiality of data. Rowhammer attacks have shown to enable privilege escalation, sandbox escape, and cryptographic key disclosures. Recently, several proposals suggest exploiting the spatial proximity between the accessed memory location and the location of the bit flip for a defense against rowhammer. These all aim to deny the attacker's permission to access memory locations near sensitive data. In this paper, we question the core assumption underlying these defenses. We present PThammer, a confused-deputy attack that causes accesses to memory locations that the attacker is not allowed to access. Specifically, PThammer exploits the address translation process of modern processors, inducing the processor to generate frequent accesses to protected memory locations. We implement PThammer, demonstrating that it is a viable attack, resulting in a system compromise (e.g., kernel privilege escalation). We further evaluate the effectiveness of proposed software-only defenses showing that PThammer can overcome those.

CRJul 4, 2020

PPaaS: Privacy Preservation as a Service

Pathum Chamikara Mahawaga Arachchige, Peter Bertok, Ibrahim Khalil et al.

Personally identifiable information (PII) can find its way into cyberspace through various channels, and many potential sources can leak such information. Data sharing (e.g. cross-agency data sharing) for machine learning and analytics is one of the important components in data science. However, due to privacy concerns, data should be enforced with strong privacy guarantees before sharing. Different privacy-preserving approaches were developed for privacy preserving data sharing; however, identifying the best privacy-preservation approach for the privacy-preservation of a certain dataset is still a challenge. Different parameters can influence the efficacy of the process, such as the characteristics of the input dataset, the strength of the privacy-preservation approach, and the expected level of utility of the resulting dataset (on the corresponding data mining application such as classification). This paper presents a framework named \underline{P}rivacy \underline{P}reservation \underline{a}s \underline{a} \underline{S}ervice (PPaaS) to reduce this complexity. The proposed method employs selective privacy preservation via data perturbation and looks at different dynamics that can influence the quality of the privacy preservation of a dataset. PPaaS includes pools of data perturbation methods, and for each application and the input dataset, PPaaS selects the most suitable data perturbation approach after rigorous evaluation. It enhances the usability of privacy-preserving methods within its pool; it is a generic platform that can be used to sanitize big data in a granular, application-specific manner by employing a suitable combination of diverse privacy-preserving algorithms to provide a proper balance between privacy and utility.

CRJan 7, 2020

Towards Practical Encrypted Network Traffic Pattern Matching for Secure Middleboxes

Shangqi Lai, Xingliang Yuan, Shi-Feng Sun et al.

Network Function Virtualisation (NFV) advances the adoption of composable software middleboxes. Accordingly, cloud data centres become major NFV vendors for enterprise traffic processing. Due to the privacy concern of traffic redirection to the cloud, secure middlebox systems (e.g., BlindBox) draw much attention; they can process encrypted packets against encrypted rules directly. However, most of the existing systems supporting pattern matching based network functions require the enterprise gateway to tokenise packet payloads via sliding windows. Such tokenisation induces a considerable communication overhead, which can be over 100$\times$ to the packet size. To overcome this bottleneck, in this paper, we propose the first bandwidth-efficient encrypted pattern matching protocol for secure middleboxes. We resort to a primitive called symmetric hidden vector encryption (SHVE), and propose a variant of it, aka SHVE+, to achieve constant and moderate communication cost. To speed up, we devise encrypted filters to reduce the number of accesses to SHVE+ during matching highly. We formalise the security of our proposed protocol and conduct comprehensive evaluations over real-world rulesets and traffic dumps. The results show that our design can inspect a packet over 20k rules within 100 $μ$s. Compared to prior work, it brings a saving of $94\%$ in bandwidth consumption.

CRDec 6, 2019

TeleHammer: A Formal Model of Implicit Rowhammer

Zhi Zhang, Yueqiang Cheng, Dongxi Liu et al.

The rowhammer bug allows an attacker to gain privilege escalation or steal private data. A key requirement of all existing rowhammer attacks is that an attacker must have access to at least part of an exploitable hammer row. We refer to such rowhammer attacks as PeriHammer. The state-of-the-art software-only defenses against PeriHammer attacks is to make the exploitable hammer rows beyond the attacker's access permission. In this paper, we question the necessity of the above requirement and propose a new class of rowhammer attacks, termed as TeleHammer. It is a paradigm shift in rowhammer attacks since it crosses privilege boundary to stealthily rowhammer an inaccessible row by implicit DRAM accesses. Such accesses are achieved by abusing inherent features of modern hardware and or software. We propose a generic model to rigorously formalize the necessary conditions to initiate TeleHammer and PeriHammer, respectively. Compared to PeriHammer, TeleHammer can defeat the advanced software-only defenses, stealthy in hiding itself and hard to be mitigated. To demonstrate the practicality of TeleHammer and its advantages, we have created a TeleHammer's instance, called PThammer, which leverages the address-translation feature of modern processors. We observe that a memory access from user space can induce a load of a Level-1 page-table entry (L1PTE) from memory and thus hammer the L1PTE once, although L1PTE is not accessible to us. To achieve a high enough hammering frequency, we flush relevant TLB and cache effectively and efficiently. To this end, we demonstrate PThammer on three different test machines and show that it can cross user-kernel boundary and induce the first bit flips in L1PTEs within 15 minutes of double-sided PThammering. We have exploited PThammer to defeat advanced software-only rowhammer defenses in default system setting.

CRNov 14, 2019

Enabling Efficient Privacy-Assured Outlier Detection over Encrypted Incremental Datasets

Shangqi Lai, Xingliang Yuan, Amin Sakzad et al.

Outlier detection is widely used in practice to track the anomaly on incremental datasets such as network traffic and system logs. However, these datasets often involve sensitive information, and sharing the data to third parties for anomaly detection raises privacy concerns. In this paper, we present a privacy-preserving outlier detection protocol (PPOD) for incremental datasets. The protocol decomposes the outlier detection algorithm into several phases and recognises the necessary cryptographic operations in each phase. It realises several cryptographic modules via efficient and interchangeable protocols to support the above cryptographic operations and composes them in the overall protocol to enable outlier detection over encrypted datasets. To support efficient updates, it integrates the sliding window model to periodically evict the expired data in order to maintain a constant update time. We build a prototype of PPOD and systematically evaluates the cryptographic modules and the overall protocols under various parameter settings. Our results show that PPOD can handle encrypted incremental datasets with a moderate computation and communication cost.

CRMay 11, 2019

GraphSE$^2$: An Encrypted Graph Database for Privacy-Preserving Social Search

Shangqi Lai, Xingliang Yuan, Shi-Feng Sun et al.

In this paper, we propose GraphSE$^2$, an encrypted graph database for online social network services to address massive data breaches. GraphSE$^2$ preserves the functionality of social search, a key enabler for quality social network services, where social search queries are conducted on a large-scale social graph and meanwhile perform set and computational operations on user-generated contents. To enable efficient privacy-preserving social search, GraphSE$^2$ provides an encrypted structural data model to facilitate parallel and encrypted graph data access. It is also designed to decompose complex social search queries into atomic operations and realise them via interchangeable protocols in a fast and scalable manner. We build GraphSE$^2$ with various queries supported in the Facebook graph search engine and implement a full-fledged prototype. Extensive evaluations on Azure Cloud demonstrate that GraphSE$^2$ is practical for querying a social graph with a million of users.

CRFeb 11, 2019

A Blockchain-based Self-tallying Voting Scheme in Decentralized IoT

Yannan Li, Willy Susilo, Guomin Yang et al.

The Internet of Things (IoT) is experiencing explosive growth and has gained extensive attention from academia and industry in recent years. Most of the existing IoT infrastructures are centralized, in which the presence of a cloud server is mandatory. However, centralized frameworks suffer from the issues of unscalability and single-point-of-failure. Consequently, decentralized IoT has been proposed by taking advantage of the emerging technology of Blockchain. Voting systems are widely adopted in IoT, such as a leader election in wireless sensor networks. Self-tallying voting systems are alternatives to traditional centralized voting systems in decentralized IoT since the traditional ones are not suitable for such scenarios. Unfortunately, self-tallying voting systems inherently suffer from fairness issues, such as adaptive and abortive issues caused by malicious voters. In this paper, we introduce a framework of self-tallying systems in decentralized IoT based on Blockchain. We propose a concrete construction and prove the proposed system satisfies all the security requirements including fairness, dispute-freeness and maximal ballot secrecy. The implementations on mobile phones demonstrate the practicability of our system.

CRFeb 20, 2018

KASR: A Reliable and Practical Approach to Attack Surface Reduction of Commodity OS Kernels

Zhi Zhang, Yueqiang Cheng, Surya Nepal et al.

Commodity OS kernels have broad attack surfaces due to the large code base and the numerous features such as device drivers. For a real-world use case (e.g., an Apache Server), many kernel services are unused and only a small amount of kernel code is used. Within the used code, a certain part is invoked only at runtime while the rest are executed at startup and/or shutdown phases in the kernel's lifetime run. In this paper, we propose a reliable and practical system, named KASR, which transparently reduces attack surfaces of commodity OS kernels at runtime without requiring their source code. The KASR system, residing in a trusted hypervisor, achieves the attack surface reduction through a two-step approach: (1) reliably depriving unused code of executable permissions, and (2) transparently segmenting used code and selectively activating them. We implement a prototype of KASR on Xen-4.8.2 hypervisor and evaluate its security effectiveness on Linux kernel-4.4.0-87-generic. Our evaluation shows that KASR reduces the kernel attack surface by 64% and trims off 40% of CVE vulnerabilities. Besides, KASR successfully detects and blocks all 6 real-world kernel rootkits. We measure its performance overhead with three benchmark tools (i.e., SPECINT, httperf and bonnie++). The experimental results indicate that KASR imposes less than 1% performance overhead (compared to an unmodified Xen hypervisor) on all the benchmarks.

CRDec 14, 2014

Privacy-Preserving and Outsourced Multi-User k-Means Clustering

Bharath K. Samanthula, Fang-Yu Rao, Elisa Bertino et al.

Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Often, the entities involved in the data mining process are end-users or organizations with limited computing and storage resources. As a result, such entities may want to refrain from participating in the PPDM process. To overcome this issue and to take many other benefits of cloud computing, outsourcing PPDM tasks to the cloud environment has recently gained special attention. We consider the scenario where n entities outsource their databases (in encrypted format) to the cloud and ask the cloud to perform the clustering task on their combined data in a privacy-preserving manner. We term such a process as privacy-preserving and outsourced distributed clustering (PPODC). In this paper, we propose a novel and efficient solution to the PPODC problem based on k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers altogether through an efficient transformation technique. Our solution builds the clusters securely in an iterative fashion and returns the final cluster centers to all entities when a pre-determined termination condition holds. The proposed solution protects data confidentiality of all the participating entities under the standard semi-honest model. To the best of our knowledge, ours is the first work to discuss and propose a comprehensive solution to the PPODC problem that incurs negligible cost on the participating entities. We theoretically estimate both the computation and communication costs of the proposed protocol and also demonstrate its practical value through experiments on a real dataset.