LG Jul 18, 2024
Krait: A Backdoor Attack Against Graph Prompt Tuning
Ying Song, Rita Singh, Balaji Palanisamy
Graph prompt tuning has emerged as a promising paradigm to effectively transfer general graph knowledge from pre-trained models to various downstream tasks, particularly in few-shot contexts. However, its susceptibility to backdoor attacks, where adversaries insert triggers to manipulate outcomes, raises a critical concern. We conduct the first study to investigate such vulnerability, revealing that backdoors can disguise benign graph prompts, thus evading detection. We introduce Krait, a novel graph prompt backdoor. Specifically, we propose a simple yet effective model-agnostic metric called label non-uniformity homophily to select poisoned candidates, significantly reducing computational complexity. To accommodate diverse attack scenarios and advanced attack types, we design three customizable trigger generation methods to craft prompts as triggers. We propose a novel centroid similarity-based loss function to optimize prompt tuning for attack effectiveness and stealthiness. Experiments on four real-world graphs demonstrate that Krait can efficiently embed triggers to merely 0.15% to 2% of training nodes, achieving high attack success rates without sacrificing clean accuracy. Notably, in one-to-one and all-to-one attacks, Krait can achieve 100% attack success rates by poisoning as few as 2 and 22 nodes, respectively. Our experiments further show that Krait remains potent across different transfer cases, attack types, and graph neural network backbones. Additionally, Krait can be successfully extended to the black-box setting, posing more severe threats. Finally, we analyze why Krait can evade both classical and state-of-the-art defenses, and provide practical insights for detecting and mitigating this class of attacks.
CR Apr 22
VRSafe: A Secure Virtual Keyboard to Mitigate Keystroke Inference in Virtual Reality
Yijun Yuan, Na Du, Adam J. Lee et al.
Password-based authentication is one of the most commonly used methods for verifying user identities, and its widespread usage continues in virtual reality (VR) applications. As a result, various forms of attacks on password-based authentication in traditional environments, such as keystroke inference and shoulder surfing, are still effective in VR applications. While keystroke inference attacks on virtual keyboards have been studied extensively, few efforts have developed an effective and cost-efficient defense strategy to mitigate keystroke inference in VR. To address this gap, this paper presents a novel QWERTY keyboard called VRSafe that is resilient to keystroke inference attacks. The proposed keyboard carefully introduces false positive keystrokes into the information collected by attackers during the typing process, making the inference of the original password difficult. VRSafe also incorporates a novel malicious login detector that can effectively identify unauthorized login attempts using credentials inferred from keystroke inference attacks, with a high detection rate and minimal time and memory cost. The proposed design is evaluated through both simulation experiments and a real-world user study, and the results show that VRSafe can significantly reduce the accuracy of keystroke inference attacks while incurring a modest overhead from a usability standpoint.
CR Feb 6
Taipan: A Query-free Transfer-based Multiple Sensitive Attribute Inference Attack Solely from Publicly Released Graphs
Ying Song, Balaji Palanisamy
Graph-structured data underpin a wide spectrum of modern applications. However, complex graph topologies and homophilic patterns can facilitate attribute inference attacks (AIAs) by enabling sensitive information leakage to propagate across local neighborhoods. Existing AIAs predominantly assume that adversaries can probe sensitive attributes through repeated model queries. Such assumptions are often impractical in real-world settings due to stringent data protection regulations, prohibitive query budgets, and heightened detection risks, especially when inferring multiple sensitive attributes. More critically, this model-centric perspective obscures a pervasive blind spot: intrinsic multiple sensitive information leakage arising solely from publicly released graphs. To exploit this unexplored vulnerability, we introduce a new attack paradigm and propose Taipan, the first query-free transfer-based attack framework for multiple sensitive attribute inference attacks on graphs (G-MSAIAs). Taipan integrates Hierarchical Attack Knowledge Routing to capture intricate inter-attribute correlations, and Prompt-guided Attack Prototype Refinement to mitigate negative transfer and performance degradation. We further present a systematic evaluation framework tailored to G-MSAIAs. Extensive experiments on diverse real-world graph datasets demonstrate that Taipan consistently achieves strong attack performance across same-distribution settings and heterogeneous similar- and out-of-distribution settings with mismatched feature dimensionalities, and remains effective even under rigorous differential privacy guarantees. Our findings underscore the urgent need for more robust multi-attribute privacy-preserving graph publishing methods and data-sharing practices.
LG Nov 14, 2025
GraphToxin: Reconstructing Full Unlearned Graphs from Graph Unlearning
Ying Song, Balaji Palanisamy
Graph unlearning has emerged as a promising solution for complying with "the right to be forgotten" regulations by enabling the removal of sensitive information upon request. However, this solution is not foolproof. The involvement of multiple parties creates new attack surfaces, and residual traces of deleted data can still remain in the unlearned graph neural networks. These vulnerabilities can be exploited by attackers to recover the supposedly erased samples, thereby undermining the inherent functionality of graph unlearning. In this work, we propose GraphToxin, the first graph reconstruction attack against graph unlearning. Specifically, we introduce a novel curvature matching module to provide fine-grained guidance for full unlearned graph recovery. We demonstrate that GraphToxin can successfully subvert the regulatory guarantees expected from graph unlearning: it can recover not only a deleted individual's information and personal links but also sensitive content from their connections, thereby posing substantially more detrimental threats. Furthermore, we extend GraphToxin to multiple node removals under both white-box and black-box settings. We highlight the necessity of a worst-case analysis and propose a comprehensive evaluation framework to systematically assess the attack performance under both random and worst-case node removals. This provides a more robust and realistic measure of the vulnerability of graph unlearning methods to graph reconstruction attacks. Our extensive experiments demonstrate the effectiveness and flexibility of GraphToxin. Notably, we show that existing defense mechanisms are largely ineffective against this attack and, in some cases, can even amplify its performance. Given the severe privacy risks posed by GraphToxin, our work underscores the urgent need for the development of more effective and robust defense strategies against this attack.
LG Jan 23, 2024
MAPPING: Debiasing Graph Neural Networks for Fair Node Classification with Limited Sensitive Information Leakage
Ying Song, Balaji Palanisamy
Despite remarkable success in diverse web-based applications, Graph Neural Networks (GNNs) inherit and further exacerbate historical discrimination and social stereotypes, which critically hinder their deployment in high-stakes domains such as online clinical diagnosis and financial crediting. However, current fairness research, which is primarily crafted for i.i.d. data, cannot be trivially replicated on non-i.i.d. graph structures with topological dependence among samples. Existing fair graph learning typically favors pairwise constraints to achieve fairness but fails to cast off dimensional limitations and generalize them to multiple sensitive attributes. Besides, most studies focus on in-processing techniques to enforce and calibrate fairness, while constructing a model-agnostic debiasing GNN framework at the pre-processing stage to prevent downstream misuse and improve training reliability is still largely under-explored. Furthermore, previous work on GNNs tends to enhance either fairness or privacy individually, and few studies probe into their interplay. In this paper, we propose a novel model-agnostic debiasing framework named MAPPING (Masking And Pruning and Message-Passing trainING) for fair node classification, in which we adopt distance covariance (dCov)-based fairness constraints to simultaneously reduce feature and topology biases in arbitrary dimensions, and combine them with adversarial debiasing to confine the risks of attribute inference attacks. Experiments on real-world datasets with different GNN variants demonstrate the effectiveness and flexibility of MAPPING. Our results show that MAPPING can achieve better trade-offs between utility and fairness, and mitigate privacy risks of sensitive information leakage.
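For readers unfamiliar with the distance covariance (dCov) statistic mentioned in the abstract above, the following is a minimal sketch of the standard sample estimator for two scalar variables. It is not the MAPPING implementation: the function names are hypothetical, and restricting the inputs to scalars is a simplification (the abstract refers to arbitrary dimensions, which would replace the absolute difference with a vector norm).

```python
import math
from typing import List, Sequence


def _centered_distances(v: Sequence[float]) -> List[List[float]]:
    """Pairwise |v_i - v_j| distances, double-centered (row, column, grand mean)."""
    n = len(v)
    d = [[abs(v[i] - v[j]) for j in range(n)] for i in range(n)]
    # The distance matrix is symmetric, so row means equal column means.
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_covariance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sample distance covariance (V-statistic form) of two scalar samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)
    v2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(v2, 0.0))
```

A value near zero indicates little statistical dependence between the two samples, which is why such a statistic can serve as a penalty between learned features and a sensitive attribute.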
CR Apr 6, 2025
SolRPDS: A Dataset for Analyzing Rug Pulls in Solana Decentralized Finance
Abdulrahman Alhaidari, Bhavani Kalal, Balaji Palanisamy et al.
Rug pulls in Solana have caused significant damage to users interacting with Decentralized Finance (DeFi). A rug pull occurs when developers exploit users' trust and drain liquidity from token pools on Decentralized Exchanges (DEXs), leaving users with worthless tokens. Although rug pulls in Ethereum and Binance Smart Chain (BSC) have gained attention recently, analysis of rug pulls in Solana remains largely under-explored. In this paper, we introduce SolRPDS (Solana Rug Pull Dataset), the first public rug pull dataset derived from Solana's transactions. We examine approximately four years of DeFi data (2021-2024) that covers suspected and confirmed tokens exhibiting rug pull patterns. The dataset, derived from 3.69 billion transactions, consists of 62,895 suspicious liquidity pools. The data is annotated for inactivity states, which is a key indicator, and includes several detailed liquidity activities such as additions, removals, and last interaction as well as other attributes such as inactivity periods and withdrawn token amounts, to help identify suspicious behavior. Our preliminary analysis reveals clear distinctions between legitimate and fraudulent liquidity pools and we found that 22,195 tokens in the dataset exhibit rug pull patterns during the examined period. SolRPDS can support a wide range of future research on rug pulls including the development of data-driven and heuristic-based solutions for real-time rug pull detection and mitigation.
CR Oct 15, 2025
On-Chain Decentralized Learning and Cost-Effective Inference for DeFi Attack Mitigation
Abdulrahman Alhaidari, Balaji Palanisamy, Prashant Krishnamurthy
Billions of dollars are lost every year in DeFi platforms by transactions exploiting business logic or accounting vulnerabilities. Existing defenses focus on static code analysis, public mempool screening, attacker contract detection, or trusted off-chain monitors, none of which prevents exploits submitted through private relays or malicious contracts that execute within the same block. We present the first decentralized, fully on-chain learning framework that: (i) performs gas-prohibitive computation on Layer-2 to reduce cost, (ii) propagates verified model updates to Layer-1, and (iii) enables gas-bounded, low-latency inference inside smart contracts. A novel Proof-of-Improvement (PoIm) protocol governs the training process and verifies each decentralized micro update as a self-verifying training transaction. Updates are accepted by PoIm only if they demonstrably improve at least one core metric (e.g., accuracy, F1-score, precision, or recall) on a public benchmark without degrading any of the other core metrics, while adversarial proposals get financially penalized through an adaptable test set for evolving threats. We develop quantization and loop-unrolling techniques that enable inference for logistic regression, SVM, MLPs, CNNs, and gated RNNs (with support for formally verified decision tree inference) within the Ethereum block gas limit, while remaining bit-exact to their off-chain counterparts, formally proven in Z3. We curate 298 unique real-world exploits (2020-2025) with 402 exploit transactions across eight EVM chains, collectively responsible for $3.74 B in losses.
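The acceptance rule described in the abstract above (improve at least one core metric, degrade none) can be sketched as a simple predicate. This is an off-chain illustration only, not the PoIm protocol itself: the function name, the metric keys, and the use of floating-point scores are assumptions made for the example.

```python
from typing import Mapping

# Hypothetical metric keys; the abstract lists these four as examples of core metrics.
CORE_METRICS = ("accuracy", "f1", "precision", "recall")


def accepts_update(current: Mapping[str, float], proposed: Mapping[str, float]) -> bool:
    """Accept a proposed model update only if it improves at least one
    core metric and does not degrade any core metric."""
    improved = any(proposed[m] > current[m] for m in CORE_METRICS)
    degraded = any(proposed[m] < current[m] for m in CORE_METRICS)
    return improved and not degraded
```

Under this rule, an update that trades a small drop in precision for a large gain in recall is rejected, since no core metric is allowed to regress.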
CR Jan 30, 2021
SteemOps: Extracting and Analyzing Key Operations in Steemit Blockchain-based Social Media Platform
Chao Li, Balaji Palanisamy, Runhua Xu et al.
Advancements in distributed ledger technologies are driving the rise of blockchain-based social media platforms such as Steemit, where users interact with each other in ways similar to conventional social networks. These platforms are autonomously managed by users using decentralized consensus protocols in a cryptocurrency ecosystem. The deep integration of social networks and blockchains in these platforms provides potential for numerous cross-domain research studies that are of interest to both research communities. However, it is challenging to process and analyze large volumes of raw Steemit data, as it requires specialized skills in both software engineering and blockchain systems and involves substantial effort in extracting and filtering various types of operations. To tackle this challenge, we collect over 38 million blocks generated in Steemit during a 45-month period from 2016/03 to 2019/11 and extract ten key types of operations performed by the users. The result is SteemOps, a new dataset that organizes more than 900 million operations from Steemit into three sub-datasets, namely (i) social-network operation dataset (SOD), (ii) witness-election operation dataset (WOD) and (iii) value-transfer operation dataset (VOD). We describe the dataset schema and its usage in detail and outline possible future research studies using SteemOps. SteemOps is designed to facilitate future research aimed at providing deeper insights on emerging blockchain-based social media platforms.
CR Sep 5, 2020
NF-Crowd: Nearly-free Blockchain-based Crowdsourcing
Chao Li, Balaji Palanisamy, Runhua Xu et al.
Advancements in distributed ledger technologies are rapidly driving the rise of decentralized crowdsourcing systems on top of open smart contract platforms like Ethereum. While decentralized blockchain-based crowdsourcing provides numerous benefits compared to centralized solutions, current implementations of decentralized crowdsourcing suffer from fundamental scalability limitations by requiring all participants to pay a small transaction fee every time they interact with the blockchain. This increases the cost of using decentralized crowdsourcing solutions, resulting in a total payment that could be even higher than the price charged by centralized crowdsourcing platforms. This paper proposes a novel suite of protocols called NF-Crowd that resolves the scalability issue by reducing the lower bound of the total cost of a decentralized crowdsourcing project to O(1). NF-Crowd is a highly reliable solution for scaling decentralized crowdsourcing. We prove that as long as participants of a project powered by NF-Crowd are rational, the O(1) lower bound of cost could be reached regardless of the scale of the crowd. We also demonstrate that as long as at least one participant of a project powered by NF-Crowd is honest, the project cannot be aborted and the results are guaranteed to be correct. We design NF-Crowd protocols for a representative type of project named crowdsourcing contest with open community review (CC-OCR). We implement the protocols over the Ethereum official test network. Our results demonstrate that NF-Crowd protocols can reduce the cost of running a CC-OCR project to less than $2 regardless of the scale of the crowd, providing a significant cost benefit in adopting decentralized crowdsourcing solutions.
CR Apr 27, 2020
EventWarden: A Decentralized Event-driven Proxy Service for Outsourcing Arbitrary Transactions in Ethereum-like Blockchains
Chao Li, Balaji Palanisamy
Transactions represent a fundamental component in blockchains as they are the primary means for users to change the blockchain state. Current blockchain systems such as Bitcoin and Ethereum require users to constantly observe the state changes of interest or the events taking place in a blockchain and to explicitly release the required transactions in response to the observed events. This paper proposes EventWarden, a decentralized event-driven proxy service for users to outsource transactions in Ethereum-like blockchains. EventWarden employs a novel combination of smart contracts and blockchain logs. EventWarden allows a user to create a proxy smart contract that specifies an event of interest and also reserves an arbitrary transaction to release. Upon observing the occurrence of the prescribed event, anyone in the blockchain network can call the proxy contract to earn the service fee reserved in the contract by proving to the contract that the event has been recorded into blockchain logs, which then automatically triggers the proxy contract to release the reserved transaction. We show that the reserved transaction can only get released from the proxy contract when the prescribed event has taken place. We also demonstrate that as long as a single member in the blockchain network is incentivized by the service fee to call the proxy contract after the prescribed event has taken place, the reserved transaction is guaranteed to get released. We implement EventWarden over the Ethereum official test network. The results demonstrate that EventWarden is effective and ready to use in practice.
CR Feb 6, 2020
Comparison of Decentralization in DPoS and PoW Blockchains
Chao Li, Balaji Palanisamy
Decentralization is a key indicator for the evaluation of public blockchains. In the past, there have been very few studies on measuring and comparing the actual level of decentralization between Proof-of-Work (PoW) blockchains and blockchains with other consensus protocols. This paper presents a new comparison study of the level of decentralization in Bitcoin and Steem, a prominent Delegated-Proof-of-Stake (DPoS) blockchain. Our study particularly focuses on analysing the power that decides the creators of blocks in the blockchain. In Bitcoin, miners with higher computational power generate more blocks. In contrast, blocks in Steem are equally generated by witnesses while witnesses are periodically elected by stakeholders with different voting power weighted by invested stake. We analyze the process of stake-weighted election of witnesses in DPoS and measure the actual stake invested by each stakeholder in Steem. We then compute the Shannon entropy of the distribution of computational power among miners in Bitcoin and the distribution of invested stake among stakeholders in Steem. Our analyses reveal that neither Bitcoin nor Steem is dominantly better than the other with respect to decentralization. Compared with Steem, Bitcoin tends to be more decentralized among top miners but less decentralized in general. Our study is designed to provide insights into the current state of the degree of decentralization in DPoS and PoW blockchains. We believe that the methodologies and findings in this paper can facilitate future studies of decentralization in other blockchain systems employing different consensus protocols.
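The Shannon entropy measure mentioned in the abstract above can be illustrated with a short sketch. The function name and the sample inputs are assumptions for illustration; the paper's actual data and any normalization it applies are not reproduced here.

```python
import math
from typing import Sequence


def shannon_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (in bits) of the distribution obtained by
    normalizing non-negative weights, e.g. hash power per miner
    or invested stake per stakeholder."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)
```

Higher entropy means power is spread more evenly: four equally powerful block creators give 2 bits, while a single dominant creator gives 0 bits.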
CR Dec 17, 2019
SilentDelivery: Practical Timed-delivery of Private Information using Smart Contracts
Chao Li, Balaji Palanisamy
This paper proposes SilentDelivery, a secure, scalable and cost-efficient protocol for implementing timed information delivery service in a decentralized blockchain network. SilentDelivery employs a novel combination of threshold secret sharing and decentralized smart contracts. The protocol maintains shares of the decryption key of the private information of an information sender using a group of mailmen recruited in a blockchain network before the specified future time-frame and restores the information to the information recipient at the required time-frame. To tackle the key challenges that limit the security and scalability of the protocol, SilentDelivery incorporates two novel countermeasure strategies. The first strategy, namely silent recruitment, enables a mailman to get recruited by a sender silently without the knowledge of any third party. The second strategy, namely dual-mode execution, makes the protocol run in a lightweight mode by default, where the cost of running smart contracts is significantly reduced. We rigorously analyze the security of SilentDelivery and implement the protocol over the Ethereum official test network. The results demonstrate that SilentDelivery is more secure and scalable compared to the state of the art and reduces the cost of running smart contracts by 85%.
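As background for the threshold secret sharing mentioned in the abstract above, here is a minimal sketch of Shamir's (k, n) scheme, one well-known instance of the idea. The abstract does not say which scheme SilentDelivery uses, so this is an assumption for illustration; the function names and the choice of prime field are hypothetical, and the code is not hardened for production use.

```python
import secrets
from typing import List, Sequence, Tuple

# Assumed field modulus: the Mersenne prime 2^127 - 1.
PRIME = 2**127 - 1


def split_secret(secret: int, n: int, k: int) -> List[Tuple[int, int]]:
    """Split `secret` into n shares such that any k of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: Sequence[Tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

In a timed-delivery setting, each share of a decryption key would be held by a different party, so that no group smaller than the threshold can reconstruct the key early.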
SI Apr 15, 2019
Incentivized Blockchain-based Social Media Platforms: A Case Study of Steemit
Chao Li, Balaji Palanisamy
This paper presents an empirical analysis of Steemit, a key representative of the emerging incentivized social media platforms over blockchains, to understand and evaluate the actual level of decentralization and the practical effects of the cryptocurrency-driven reward system in these modern social media platforms. Similar to Bitcoin, Steemit is operated by a decentralized community, where 21 members are periodically elected to cooperatively operate the platform through the Delegated Proof-of-Stake (DPoS) consensus protocol. Our study of 539 million operations performed by 1.12 million Steemit users during the period 2016/03 to 2018/08 reveals that the actual level of decentralization in Steemit is far lower than the ideal level, indicating that the DPoS consensus protocol may not be a desirable approach for establishing a highly decentralized social media platform. In Steemit, users create content as posts, which get curated based on votes from other users. The platform periodically issues cryptocurrency as rewards to creators and curators of popular posts. Although such a reward system is originally driven by the desire to incentivize users to contribute high-quality content, our analysis of the underlying cryptocurrency transfer network on the blockchain reveals that more than 16% of cryptocurrency transfers in Steemit are sent to curators suspected to be bots, and also finds the existence of an underlying supply network for the bots, both suggesting a significant misuse of the current reward system in Steemit. Our study is designed to provide insights on the current state of this emerging blockchain-based social media platform, including the effectiveness of its design and the operation of the consensus protocols and the reward system.
CR Feb 18, 2019
Scalable and Privacy-preserving Design of On/Off-chain Smart Contracts
Chao Li, Balaji Palanisamy, Runhua Xu
The rise of smart contract systems such as Ethereum has resulted in a proliferation of blockchain-based decentralized applications including applications that store and manage a wide range of data. Current smart contracts are designed to be executed solely by miners and are revealed entirely on-chain, resulting in reduced scalability and privacy. In this paper, we discuss that scalability and privacy of smart contracts can be enhanced by splitting a given contract into an off-chain contract and an on-chain contract. Specifically, functions of the contract that involve high-cost computation or sensitive information can be split and included as the off-chain contract, that is signed and executed by only the interested participants. The proposed approach allows the participants to reach unanimous agreement off-chain when all of them are honest, allowing computing resources of miners to be saved and content of the off-chain contract to be hidden from the public. In case of a dispute caused by any dishonest participants, a signed copy of the off-chain contract can be revealed so that a verified instance can be created to make miners enforce the true execution result. Thus, honest participants have the ability to redress and penalize any fraudulent or dishonest behavior, which incentivizes all participants to honestly follow the agreed off-chain contract. We discuss techniques for splitting a contract into a pair of on/off-chain contracts and propose a mechanism to address the challenges of handling dishonest participants in the system. Our implementation and evaluation of the proposed approach using an example smart contract demonstrate the effectiveness of the proposed approach in Ethereum.
CR Feb 14, 2019
Decentralized Release of Self-emerging Data using Smart Contracts
Chao Li, Balaji Palanisamy
In the age of Big Data, releasing protected sensitive data at a future point in time is critical for various applications. Such self-emerging data release requires the data to be protected until a prescribed data release time and be automatically released to the recipient at the release time, even if the data sender goes offline. While straightforward centralized approaches provide a basic solution to the problem, unfortunately they are limited to a single point of trust and involve a single point of control. This paper presents decentralized techniques for supporting self-emerging data using smart contracts in Ethereum blockchain networks. We design a credible and enforceable smart contract for supporting self-emerging data release. The smart contract employs a set of Ethereum peers to jointly follow the proposed timed-release service protocol, allowing the participating peers to earn the remuneration paid by the service users. We model the problem as an extensive-form game with imperfect information to protect against possible adversarial attacks including some peers destroying the private data (drop attack) or secretly releasing the private data before the release time (release-ahead attack). We demonstrate the efficacy and attack-resilience of the proposed techniques through rigorous analysis and experimental evaluation. Our implementation and experimental evaluation on the Ethereum official test network demonstrate the low monetary cost and the low time overhead associated with the proposed approach and validate its guaranteed security properties.
CR Feb 14, 2019
Decentralized Privacy-preserving Timed Execution in Blockchain-based Smart Contract Platforms
Chao Li, Balaji Palanisamy
In the age of Big Data, enabling task scheduling while protecting users' privacy is critical for various decentralized applications in blockchain-based smart contract platforms. Such a privacy-preserving task scheduler requires the task input data to be secretly maintained until a prescribed task execution time and be automatically recorded into the blockchain to enable the execution of the task at the execution time, even if the user goes offline. While straightforward centralized approaches provide a basic solution to the problem, unfortunately they are limited to a single point of trust and involve a single point of control. This paper presents decentralized techniques for supporting privacy-preserving task scheduling using smart contracts in Ethereum blockchain networks. We design a privacy-preserving task scheduling protocol that is managed by a manager smart contract. The protocol requires a user to schedule a task by deploying a proxy smart contract maintaining the non-sensitive information of the task while creating decentralized secret trust and selecting trustees from the network to maintain the sensitive information of the task. With security techniques including secret sharing and layered encryption, as well as a security deposit paid by trustees as economic deterrence, the protocol can protect the sensitive information against possible attacks including some trustees destroying the sensitive information (drop attack) or secretly releasing the sensitive information before the execution time (release-ahead attack). We demonstrate the attack-resilience of the proposed protocol through rigorous analysis. Our implementation and experimental evaluation on the Ethereum official test network demonstrate the low monetary cost and the low time overhead associated with the proposed approach.
CR Aug 25, 2018
Privacy in Internet of Things: from Principles to Technologies
Chao Li, Balaji Palanisamy
Ubiquitous deployment of low-cost smart devices and widespread use of high-speed wireless networks have led to the rapid development of the Internet of Things (IoT). IoT embraces countless physical objects that have not been involved in the traditional Internet and enables their interaction and cooperation to provide a wide range of IoT applications. Many services in the IoT may require a comprehensive understanding and analysis of data collected through a large number of physical devices, which challenges both personal information privacy and the development of IoT. Information privacy in IoT is a broad and complex concept, as its understanding and perception differ among individuals and its enforcement requires efforts from both legislation and technology. In this paper, we review the state-of-the-art principles of privacy laws, the architectures for IoT and the representative privacy enhancing technologies (PETs). We analyze how legal principles can be supported through a careful implementation of PETs at various layers of a layered IoT architecture model to meet the privacy requirements of the individuals interacting with IoT systems. We demonstrate how privacy legislation maps to privacy principles, which in turn drive the design of the privacy enhancing technologies to be employed in the IoT architecture stack.