Kai Zhou

h-index 17

15 papers

266 citations

Novelty 54%

AI Score 39

Ranked #80,793 of 194,257 authors (top 42%); #1,945 in CR (top 29%)

15 Papers

8.4 CR Jul 26, 2023 Code

Coupled-Space Attacks against Random-Walk-based Anomaly Detection

Yuni Lai, Marcin Waniek, Liying Li et al.

Random Walks-based Anomaly Detection (RWAD) is commonly used to identify anomalous patterns in various applications. An intriguing characteristic of RWAD is that the input graph can either be pre-existing or constructed from raw features. Consequently, there are two potential attack surfaces against RWAD: graph-space attacks and feature-space attacks. In this paper, we explore this vulnerability by designing practical coupled-space attacks, investigating the interplay between graph-space and feature-space attacks. To this end, we conduct a thorough complexity analysis, proving that attacking RWAD is NP-hard. Then, we proceed to formulate the graph-space attack as a bi-level optimization problem and propose two strategies to solve it: alternative iteration (alterI-attack) or utilizing the closed-form solution of the random walk model (cf-attack). Finally, we utilize the results from the graph-space attacks as guidance to design more powerful feature-space attacks (i.e., graph-guided attacks). Comprehensive experiments demonstrate that our proposed attacks are effective in enabling the target nodes to evade detection by RWAD with a limited attack budget. In addition, we conduct transfer attack experiments in a black-box setting, which show that our feature attack significantly decreases the anomaly scores of target nodes. Our study opens the door to studying the coupled-space attack against graph anomaly detection in which the graph space relies on the feature space.

2.0 LG Jul 24, 2023

Robust Graph Contrastive Learning with Information Restoration

Yulin Zhu, Xing Ai, Yevgeniy Vorobeychik et al.

The graph contrastive learning (GCL) framework has achieved remarkable success in graph representation learning. However, similar to graph neural networks (GNNs), GCL models are susceptible to graph structural attacks. As an unsupervised method, GCL faces greater challenges in defending against adversarial attacks. Furthermore, there has been limited research on enhancing the robustness of GCL. To thoroughly explore the failure of GCL on poisoned graphs, we investigate the detrimental effects of graph structural attacks against the GCL framework. We discover that, in addition to the conventional observation that graph structural attacks tend to connect dissimilar node pairs, these attacks also diminish the mutual information between the graph and its representations from an information-theoretical perspective, which is the cornerstone of high-quality node embeddings for GCL. Motivated by this theoretical insight, we propose a robust graph contrastive learning framework with a learnable sanitation view that endeavors to sanitize the augmented graphs by restoring the diminished mutual information caused by the structural attacks. Additionally, we design a fully unsupervised tuning strategy to tune the hyperparameters without accessing the label information, which strictly coincides with the defender's knowledge. Extensive experiments demonstrate the effectiveness and efficiency of our proposed method compared to competitive baselines.

12.3 LG Dec 7, 2023

Node-aware Bi-smoothing: Certified Robustness against Graph Injection Attacks

Yuni Lai, Yulin Zhu, Bailin Pan et al.

Deep Graph Learning (DGL) has emerged as a crucial technique across various domains. However, recent studies have exposed vulnerabilities in DGL models, such as susceptibility to evasion and poisoning attacks. While empirical and provable robustness techniques have been developed to defend against graph modification attacks (GMAs), the problem of certified robustness against graph injection attacks (GIAs) remains largely unexplored. To bridge this gap, we introduce the node-aware bi-smoothing framework, which is the first certifiably robust approach for general node classification tasks against GIAs. Notably, the proposed node-aware bi-smoothing scheme is model-agnostic and is applicable for both evasion and poisoning attacks. Through rigorous theoretical analysis, we establish the certifiable conditions of our smoothing scheme. We also explore the practical implications of our node-aware bi-smoothing schemes in two contexts: as an empirical defense approach against real-world GIAs and in the context of recommendation systems. Furthermore, we extend two state-of-the-art certified robustness frameworks to address node injection attacks and compare our approach against them. Extensive evaluations demonstrate the effectiveness of our proposed certificates.

7.3 CR Mar 3, 2024

Collective Certified Robustness against Graph Injection Attacks

Yuni Lai, Bailin Pan, Kaihuang Chen et al.

We investigate certified robustness for GNNs under graph injection attacks. Existing research only provides sample-wise certificates by verifying each node independently, leading to very limited certifying performance. In this paper, we present the first collective certificate, which certifies a set of target nodes simultaneously. To achieve it, we formulate the problem as a binary integer quadratically constrained linear program (BQCLP). We further develop a customized linearization technique that allows us to relax the BQCLP into a linear program (LP) that can be efficiently solved. Through comprehensive experiments, we demonstrate that our collective certification scheme significantly improves certification performance with minimal computational overhead. For instance, by solving the LP within 1 minute on the Citeseer dataset, we achieve a significant increase in the certified ratio from 0.0% to 81.2% when the injected node number is 5% of the graph size. Our work marks a crucial step towards making provable defense more practical.

5.4 AI Dec 12, 2023

Cost Aware Untargeted Poisoning Attack against Graph Neural Networks

Yuwei Han, Yuni Lai, Yulin Zhu et al.

Graph Neural Networks (GNNs) have become widely used in the field of graph mining. However, these networks are vulnerable to structural perturbations. While many research efforts have focused on analyzing vulnerability through poisoning attacks, we have identified an inefficiency in current attack losses. These losses steer the attack strategy towards modifying edges targeting misclassified nodes or resilient nodes, resulting in a waste of structural adversarial perturbation. To address this issue, we propose a novel attack loss framework called the Cost Aware Poisoning Attack (CA-attack) to improve the allocation of the attack budget by dynamically considering the classification margins of nodes. Specifically, it prioritizes nodes with smaller positive margins while postponing nodes with negative margins. Our experiments demonstrate that the proposed CA-attack significantly enhances existing attack strategies.
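
The margin-based prioritization mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a node's classification margin is the score of its true class minus the highest score among the other classes. The weighting function `budget_weight` and its `temperature` parameter are hypothetical and are not taken from the paper; they only show one way to favor small positive margins and set aside negative ones.

```python
import math

def classification_margin(scores, true_label):
    """Score of the true class minus the best score among the other classes."""
    best_other = max(s for i, s in enumerate(scores) if i != true_label)
    return scores[true_label] - best_other

def budget_weight(margin, temperature=1.0):
    """Hypothetical weight (an assumption, not the paper's loss).

    Nodes with non-positive margins are already misclassified, so they get
    weight 0 here as a crude stand-in for "postponing" them. Nodes with large
    positive margins are hard to flip, so their weight decays exponentially.
    """
    if margin <= 0:
        return 0.0
    return math.exp(-margin / temperature)

def rank_nodes(all_scores, labels):
    """Return node indices ordered from most to least attractive target."""
    weights = [
        budget_weight(classification_margin(s, y))
        for s, y in zip(all_scores, labels)
    ]
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)

# Toy class scores for three nodes whose true label is 0.
scores = [
    [0.55, 0.45],  # small positive margin: cheap to flip
    [0.95, 0.05],  # large positive margin: resilient
    [0.30, 0.70],  # negative margin: already misclassified
]
labels = [0, 0, 0]
order = rank_nodes(scores, labels)
print(order)
```

In this toy example the node with the small positive margin is ranked first and the already misclassified node last, which mirrors the allocation idea stated in the abstract.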

4.1 LG Jul 25, 2025

Multi-Grained Temporal-Spatial Graph Learning for Stable Traffic Flow Forecasting

Zhenan Lin, Yuni Lai, Wai Lun Lo et al.

Time-evolving traffic flow forecasting plays a vital role in intelligent transportation systems and smart cities. However, dynamic traffic flow forecasting is a highly nonlinear problem with complex temporal-spatial dependencies. Although existing methods have contributed greatly to mining the temporal-spatial patterns in complex traffic networks, they fail to encode global temporal-spatial patterns and are prone to overfitting on the pre-defined geographical correlations, which hinders the model's robustness in complex traffic environments. To tackle this issue, we propose a multi-grained temporal-spatial graph learning framework that adaptively augments the global temporal-spatial patterns obtained from a crafted graph transformer encoder with the local patterns from graph convolution, using a crafted gated fusion unit with residual connections. As a result, our proposed model can mine the hidden global temporal-spatial relations among monitoring stations and balance the relative importance of local and global temporal-spatial patterns. Experimental results demonstrate the strong representation capability of our proposed method, and our model consistently outperforms other strong baselines on various real-world traffic networks.

3.3 LG Jul 26, 2020

Robust Collective Classification against Structural Attacks

Kai Zhou, Yevgeniy Vorobeychik

Collective learning methods exploit relations among data points to enhance classification performance. However, such relations, represented as edges in the underlying graphical model, expose an extra attack surface to the adversaries. We study adversarial robustness of an important class of such graphical models, Associative Markov Networks (AMN), to structural attacks, where an attacker can modify the graph structure at test time. We formulate the task of learning a robust AMN classifier as a bi-level program, where the inner problem is a challenging non-linear integer program that computes optimal structural changes to the AMN. To address this technical challenge, we first relax the attacker problem, and then use duality to obtain a convex quadratic upper bound for the robust AMN problem. We then prove a bound on the quality of the resulting approximately optimal solutions, and experimentally demonstrate the efficacy of our approach. Finally, we apply our approach in a transductive learning setting, and show that robust AMN is much more robust than state-of-the-art deep learning methods, while sacrificing little in accuracy on non-adversarial data.

11.2 AI Sep 3, 2019

Adversarial Robustness of Similarity-Based Link Prediction

Kai Zhou, Tomasz P. Michalak, Yevgeniy Vorobeychik

Link prediction is one of the fundamental problems in social network analysis. A common set of techniques for link prediction rely on similarity metrics which use the topology of the observed subnetwork to quantify the likelihood of unobserved links. Recently, similarity metrics for link prediction have been shown to be vulnerable to attacks whereby observations about the network are adversarially modified to hide target links. We propose a novel approach for increasing robustness of similarity-based link prediction by endowing the analyst with a restricted set of reliable queries which accurately measure the existence of queried links. The analyst aims to robustly predict a collection of possible links by optimally allocating the reliable queries. We formalize the analyst problem as a Bayesian Stackelberg game in which they first choose the reliable queries, followed by an adversary who deletes a subset of links among the remaining (unreliable) queries by the analyst. The analyst in our model is uncertain about the particular target link the adversary attempts to hide, whereas the adversary has full information about the analyst and the network. Focusing on similarity metrics using only local information, we show that the problem is NP-Hard for both players, and devise two principled and efficient approaches for solving it approximately. Extensive experiments with real and synthetic networks demonstrate the effectiveness of our approach.

11.3 SI Sep 22, 2018

Kai Zhou, Tomasz P. Michalak, Talal Rahwan et al.

Link prediction is one of the fundamental problems in computational social science. A particularly common means to predict existence of unobserved links is via structural similarity metrics, such as the number of common neighbors; node pairs with higher similarity are thus deemed more likely to be linked. However, a number of applications of link prediction, such as predicting links in gang or terrorist networks, are adversarial, with another party incentivized to minimize its effectiveness by manipulating observed information about the network. We offer a comprehensive algorithmic investigation of the problem of attacking similarity-based link prediction through link deletion, focusing on two broad classes of such approaches, one which uses only local information about target links, and another which uses global network information. While we show several variations of the general problem to be NP-Hard for both local and global metrics, we exhibit a number of well-motivated special cases which are tractable. Additionally, we provide principled and empirically effective algorithms for the intractable cases, in some cases proving worst-case approximation guarantees.
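
To make the link-deletion setting concrete, the following minimal sketch computes the common-neighbors score named in the abstract and shows how deleting one observed edge lowers the score of a target pair. The toy graph and the helper names `common_neighbors` and `delete_edge` are illustrative assumptions; the sketch does not reproduce the paper's attack algorithms.

```python
def common_neighbors(adj, u, v):
    """Number of nodes adjacent to both u and v."""
    return len(adj[u] & adj[v])

def delete_edge(adj, a, b):
    """Return a copy of the graph with the undirected edge (a, b) removed."""
    new_adj = {node: set(nbrs) for node, nbrs in adj.items()}
    new_adj[a].discard(b)
    new_adj[b].discard(a)
    return new_adj

# Toy undirected graph as adjacency sets; the target link (0, 1) is unobserved.
graph = {
    0: {2, 3},
    1: {2, 3},
    2: {0, 1},
    3: {0, 1},
}

score_before = common_neighbors(graph, 0, 1)
attacked = delete_edge(graph, 0, 2)          # the adversary hides edge (0, 2)
score_after = common_neighbors(attacked, 0, 1)
print(score_before, score_after)
```

Because the score depends only on the neighborhoods of the two endpoints, this is an instance of the "local information" class of metrics mentioned in the abstract.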

2.3 CR Sep 12, 2018

Security and Privacy Enhancement for Outsourced Biometric Identification

Kai Zhou, Jian Ren

A lot of research has focused on secure outsourcing of biometric identification in the context of cloud computing. In such schemes, both the encrypted biometric database and the identification process are outsourced to the cloud. The ultimate goal is to protect the security and privacy of the biometric database and the query templates. Security analysis shows that previous schemes suffer from the enrolment attack and expose more information than needed. In this paper, we propose a new secure outsourcing scheme that aims to enhance security in these two respects. First, besides all the attacks discussed in previous schemes, our proposed scheme is also secure against the enrolment attack. Second, we model the identification process as a fixed-radius similarity query problem instead of the kNN search problem. Such modelling reduces the exposed information and thus enhances the privacy of the biometric database. Our comprehensive security and complexity analysis shows that our scheme enhances the security and privacy of the biometric database and query templates while maintaining the same computational savings from outsourcing.
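
The difference between the two query models can be sketched on plaintext data. The example below involves no encryption and is not the paper's scheme; the records, the `radius` value, and the function names are assumptions chosen for illustration. It only shows that a kNN query always returns records, while a fixed-radius query returns nothing when no record is close enough.

```python
import math

def knn_query(database, query, k):
    """Return the k closest records, no matter how far away they are."""
    ranked = sorted(database, key=lambda item: math.dist(item[1], query))
    return [name for name, _ in ranked[:k]]

def fixed_radius_query(database, query, radius):
    """Return only the records whose distance to the query is within radius."""
    return [name for name, vec in database if math.dist(vec, query) <= radius]

# Hypothetical two-dimensional templates.
database = [
    ("alice", (0.0, 0.0)),
    ("bob", (3.0, 4.0)),
    ("carol", (10.0, 0.0)),
]

near_query = (0.5, 0.0)
far_query = (100.0, 100.0)

knn_near = knn_query(database, near_query, k=2)
radius_near = fixed_radius_query(database, near_query, radius=1.0)
knn_far = knn_query(database, far_query, k=2)
radius_far = fixed_radius_query(database, far_query, radius=1.0)
print(knn_near, radius_near, knn_far, radius_far)
```

For the far query, the kNN model still reveals which two records are nearest, whereas the fixed-radius model returns an empty result.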

8.6 SI Sep 1, 2018

Attack Tolerance of Link Prediction Algorithms: How to Hide Your Relations in a Social Network

Marcin Waniek, Kai Zhou, Yevgeniy Vorobeychik et al.

Link prediction is one of the fundamental research problems in network analysis. Intuitively, it involves identifying the edges that are most likely to be added to a given network, or the edges that appear to be missing from the network when in fact they are present. Various algorithms have been proposed to solve this problem over the past decades. For all their benefits, such algorithms raise serious privacy concerns, as they could be used to expose a connection between two individuals who wish to keep their relationship private. With this in mind, we investigate the ability of such individuals to evade link prediction algorithms. More precisely, we study their ability to strategically alter their connections so as to increase the probability that some of their connections remain unidentified by link prediction algorithms. We formalize this question as an optimization problem, and prove that finding an optimal solution is NP-complete. Despite this hardness, we show that the situation is not bleak in practice. In particular, we propose two heuristics that can easily be applied by members of the general public on existing social media. We demonstrate the effectiveness of those heuristics on a wide variety of networks and against a plethora of link prediction algorithms.

5.8 CR Jan 8, 2018

P-MOD: Secure Privilege-Based Multilevel Organizational Data-Sharing in Cloud Computing

Ehab Zaghloul, Kai Zhou, Jian Ren

Cloud computing has changed the way enterprises store, access and share data. Data is constantly being uploaded to the cloud and shared within an organization built on a hierarchy of many different individuals that are given certain data access privileges. With more data storage needs turning over to the cloud, finding a secure and efficient data access structure has become a major research issue. With different access privileges, individuals with more privileges (at higher levels of the hierarchy) are granted access to more sensitive data than those with fewer privileges (at lower levels of the hierarchy). In this paper, a Privilege-based Multilevel Organizational Data-sharing scheme (P-MOD) is proposed that incorporates a privilege-based access structure into an attribute-based encryption mechanism to handle these concerns. Each level of the privilege-based access structure is affiliated with an access policy that is uniquely defined by specific attributes. Data is then encrypted under each access policy at every level to grant access to specific data users based on their data access privileges. An individual ranked at a certain level can decrypt the ciphertext (at that specific level) if and only if that individual owns a correct set of attributes that can satisfy the access policy of that level. The user may also decrypt the ciphertexts at the lower levels with respect to the user's level. Security analysis shows that P-MOD is secure against adaptively chosen plaintext attack assuming the DBDH assumption holds. The comprehensive performance analysis demonstrates that P-MOD is more efficient in computational complexity and storage space than the existing schemes in secure data sharing within an organization.

6.3 CR Nov 14, 2017

PassBio: Privacy-Preserving User-Centric Biometric Authentication

Kai Zhou, Jian Ren

The proliferation of online biometric authentication has necessitated security requirements of biometric templates. The existing secure biometric authentication schemes feature a server-centric model, where a service provider maintains a biometric database and is fully responsible for the security of the templates. The end-users have to fully trust the server in storing, processing and managing their private templates. As a result, the end-users' templates could be compromised by outside attackers or even the service provider itself. In this paper, we propose a user-centric biometric authentication scheme (PassBio) that enables end-users to encrypt their own templates with our proposed lightweight encryption scheme. During authentication, all the templates remain encrypted such that the server will never see them directly. However, the server is able to determine whether the distance between two encrypted templates is within a pre-defined threshold. Our security analysis shows that no critical information of the templates can be revealed under both passive and active attacks. PassBio follows a "compute-then-compare" computational model over encrypted data. More specifically, our proposed Threshold Predicate Encryption (TPE) scheme can encrypt two vectors x and y in such a manner that the inner product of x and y can be evaluated and compared to a pre-defined threshold. TPE guarantees that only the comparison result is revealed and no key information about x and y can be learned. Furthermore, we show that TPE can be utilized as a flexible building block to evaluate different distance metrics such as Hamming distance and Euclidean distance over encrypted data. Such a compute-then-compare computational model, enabled by TPE, can be widely applied in many interesting applications such as searching over encrypted data while ensuring data security and privacy.
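
The claim that an inner-product threshold test can serve as a building block for distance metrics rests on a standard identity: the squared Euclidean distance satisfies ||x - y||^2 = ||x||^2 - 2<x, y> + ||y||^2, so it equals the inner product of two suitably extended vectors. The sketch below shows only this plaintext arithmetic. It performs no encryption and is not the TPE construction; the extension layout and function names are assumptions made for illustration.

```python
def extend_enrolled(x):
    """Map x to (x_1, ..., x_n, ||x||^2, 1)."""
    return list(x) + [sum(v * v for v in x), 1]

def extend_query(y):
    """Map y to (-2*y_1, ..., -2*y_n, 1, ||y||^2)."""
    return [-2 * v for v in y] + [1, sum(v * v for v in y)]

def inner_product(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))

def within_threshold(x, y, threshold_sq):
    """True when the squared Euclidean distance of x and y is <= threshold_sq."""
    return inner_product(extend_enrolled(x), extend_query(y)) <= threshold_sq

# Hypothetical integer templates.
x = [1, 2, 3]
y = [2, 2, 5]

dist_sq_direct = sum((a - b) ** 2 for a, b in zip(x, y))
dist_sq_inner = inner_product(extend_enrolled(x), extend_query(y))
print(dist_sq_direct, dist_sq_inner, within_threshold(x, y, 5))
```

The two computed values agree, so comparing the inner product of the extended vectors with a threshold is equivalent to a fixed-threshold test on Euclidean distance.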

11.9 CR Feb 26, 2016

ExpSOS: Secure and Verifiable Outsourcing of Exponentiation Operations for Mobile Cloud Computing

Kai Zhou, M. H. Afifi, Jian Ren

Discrete exponentiation, such as modular exponentiation and scalar multiplication on elliptic curves, is a basic operation of many public-key cryptosystems. However, exponentiation is considered prohibitively expensive for resource-constrained mobile devices. In this paper, we address the problem of securely outsourcing exponentiation operations to a single untrusted server. Our proposed scheme (ExpSOS) requires only a very limited number of modular multiplications in the local mobile environment and thus achieves an impressive computational gain. ExpSOS also provides a secure verification scheme with probability approximately 1 to ensure that the mobile end-users can always receive valid results. The comprehensive analysis, together with simulation results on a real mobile device, demonstrates that ExpSOS significantly improves on existing schemes in efficiency, security and result verifiability. We apply ExpSOS to securely outsource several cryptographic protocols to show that ExpSOS is widely applicable to many cryptographic computations.

3.2 CR Nov 7, 2015

CASO: Cost-Aware Secure Outsourcing of General Computational Problems

Kai Zhou, Jian Ren

Computation outsourcing is an integral part of cloud computing. It enables end-users to outsource their computational tasks to the cloud and utilize the shared cloud resources in a pay-per-use manner. However, once the tasks are outsourced, the end-users lose control of their data, which may result in severe security issues, especially when the data is sensitive. To address this problem, secure outsourcing mechanisms have been proposed to ensure the security of the end-users' outsourced data. In this paper, we investigate outsourcing of general computational problems, which form the mathematical basis of problems arising in various fields such as engineering and finance. To be specific, we propose affine mapping based schemes for the problem transformation and outsourcing so that the cloud is unable to learn any key information from the transformed problem. Meanwhile, the overhead of the transformation is limited to an acceptable level compared to the computational savings introduced by the outsourcing itself. Furthermore, we develop cost-aware schemes to balance the trade-offs between end-users' various security demands and computational overhead. We also propose a verification scheme to ensure that the end-users will always receive a valid solution from the cloud. Our extensive complexity and security analyses show that our proposed Cost-Aware Secure Outsourcing (CASO) scheme is both practical and effective.
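
As a rough illustration of what an affine-mapping transformation can look like, the sketch below applies a secret map x = K y + r to a linear system A x = b, lets a stand-in "cloud" solve the transformed system (A K) y = b - A r, and then recovers and verifies x locally. The choice of a linear system, the diagonal matrix `K`, the vector `r`, and all function names are assumptions made for this example. The sketch shows only the algebra; it makes no claim about the security or cost properties analysed in the paper.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def mat_mul(A, B):
    """Matrix-matrix product."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def solve(A, b):
    """Gauss-Jordan elimination with exact fractions (stands in for the cloud).

    Assumes A is square and invertible; a singular matrix raises StopIteration.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

# End-user's private problem: A x = b.
A = [[2, 1], [1, 3]]
b = [5, 10]

# Secret affine map x = K y + r (K invertible; values are arbitrary here).
K = [[3, 0], [0, 5]]
r = [7, -2]

# Transformed problem handed to the cloud: (A K) y = b - A r.
A_t = mat_mul(A, K)
b_t = [bi - ari for bi, ari in zip(b, mat_vec(A, r))]

y = solve(A_t, b_t)                                 # computed by the cloud
x = [ky + ri for ky, ri in zip(mat_vec(K, y), r)]   # recovered locally
is_valid = mat_vec(A, x) == b                       # local verification
print(x, is_valid)
```

The final check costs one matrix-vector product, which is much cheaper than solving the system, and illustrates the kind of result verification the abstract refers to.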