LG · Jul 18, 2022
When Fairness Meets Privacy: Fair Classification with Semi-Private Sensitive Attributes
Canyu Chen, Yueqing Liang, Xiongxiao Xu et al.
Machine learning models have demonstrated promising performance in many areas. However, concerns that they can be biased against specific demographic groups hinder their adoption in high-stakes applications. Thus, it is essential to ensure fairness in machine learning models. Most previous efforts require direct access to sensitive attributes to mitigate bias. Nonetheless, it is often infeasible to obtain users' sensitive attributes at scale, given users' privacy concerns in the data collection process. Privacy mechanisms such as local differential privacy (LDP) are widely enforced on sensitive information at the data collection stage due to legal compliance and people's increasing awareness of privacy. Therefore, a critical problem is how to make fair predictions under privacy. We study a novel and practical problem of fair classification in a semi-private setting, where most of the sensitive attributes are private and only a small number of clean ones are available. To this end, we propose a novel framework, FairSP, that can achieve Fair prediction under the Semi-Private setting. First, FairSP learns to correct the noise-protected sensitive attributes by exploiting the limited clean sensitive attributes. Then, it jointly models the corrected and clean data in an adversarial way for debiasing and prediction. Theoretical analysis shows that the proposed model can ensure fairness under mild assumptions in the semi-private setting. Extensive experimental results on real-world datasets demonstrate the effectiveness of our method for making fair predictions under privacy while maintaining high accuracy.
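To make the "noise-protected sensitive attributes" concrete, here is a minimal sketch of one common LDP mechanism, binary randomized response. The abstract does not say which mechanism FairSP assumes, so the mechanism choice and the helper names `randomized_response` and `estimate_positive_rate` are illustrative assumptions, not the authors' method.

```python
import math
import random


def randomized_response(bit, epsilon, rng):
    """Report a binary attribute with binary randomized response.

    The true bit is kept with probability e^eps / (1 + e^eps) and flipped
    otherwise, which satisfies epsilon-LDP for a single binary value.
    """
    keep_prob = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < keep_prob else 1 - bit


def estimate_positive_rate(reports, epsilon):
    """Unbiased estimate of the true fraction of 1s from noisy reports.

    Requires epsilon > 0, otherwise the reports carry no signal.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[observed] = rate * (2p - 1) + (1 - p); solve for rate.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical data: 30% of 10,000 users hold attribute value 1.
rng = random.Random(0)
true_bits = [1] * 3000 + [0] * 7000
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
print("estimated positive rate:", estimate_positive_rate(reports, 1.0))
```

The aggregate rate can be debiased this way, but each individual report stays noisy, which is why a method that needs per-sample sensitive attributes would have to correct them by other means, such as the small clean subset the abstract mentions.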
SY · Mar 27, 2019
Homogeneous and Mixed Energy Communities Discovery with Spatial-Temporal Net Energy
Shangyu Xie, Han Wang, Shengbin Wang et al.
The smart grid has integrated an increasing number of distributed energy resources to improve the efficiency and flexibility of power generation and consumption as well as the resilience of the power grid. The energy consumers on the power grid (e.g., households) equipped with distributed energy resources can be considered "microgrids" that both generate and consume electricity. In this paper, we study energy community discovery problems, which identify multiple kinds of energy communities among the microgrids, such as homogeneous energy communities (HECs), mixed energy communities (MECs), and self-sufficient energy communities (SECs), to facilitate energy management on the grid (e.g., power supply adjustment, load balancing, energy sharing). Specifically, we present efficient algorithms to discover such communities of microgrids by taking into account not only their geo-locations but also their net energy over any period. Finally, we experimentally validate the performance of the algorithms using both synthetic and real datasets.
CR · Jan 18, 2023
Label Inference Attack against Split Learning under Regression Setting
Shangyu Xie, Xin Yang, Yuanshun Yao et al.
As a crucial building block in vertical Federated Learning (vFL), Split Learning (SL) has proven practical for two-party model training collaboration, where one party holds the features of the data samples and the other party holds the corresponding labels. Such a method is claimed to be private because the only information shared is the embedding vectors and gradients rather than the private raw data and labels. However, some recent works have shown that the private labels can be leaked by the gradients. These existing attacks only work under the classification setting, where the private labels are discrete. In this work, we go a step further and study the leakage in the regression setting, where the private labels are continuous numbers (instead of discrete labels as in classification). The unbounded output range makes it harder for previous attacks to infer the continuous labels. To address this limitation, we propose a novel learning-based attack that integrates gradient information with extra learning regularization objectives based on model training properties, and that can effectively infer the labels under regression settings. Comprehensive experiments on various datasets and models demonstrate the effectiveness of our proposed attack. We hope our work can pave the way for future analyses that make the vFL framework more secure.
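A toy calculation shows why the gradients returned to the feature party carry label information in a regression setting. This sketch assumes a linear head `w·e + b` with squared-error loss, and (unrealistically) that the attacker knows `w` and `b`. It is not the learning-based attack described above. All names here are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def embedding_gradient(embedding, label, w, b):
    """Gradient of (w.e + b - label)**2 with respect to the embedding e.

    This is the quantity the label party sends back in split learning.
    """
    residual = dot(w, embedding) + b - label
    return [2.0 * residual * wi for wi in w]


def recover_label(embedding, grad, w, b):
    """Invert the gradient to recover the label, assuming w and b are known.

    grad = 2 * residual * w, so residual = (grad . w) / (2 * ||w||^2).
    Requires w to be non-zero.
    """
    residual = dot(grad, w) / (2.0 * dot(w, w))
    return dot(w, embedding) + b - residual


# Hypothetical values for illustration only.
w, b = [0.5, -1.0, 2.0], 0.1
embedding, secret_label = [1.0, 2.0, 3.0], 3.7
grad = embedding_gradient(embedding, secret_label, w, b)
print("recovered label:", recover_label(embedding, grad, w, b))
```

In practice the feature party does not know the label party's model, which is the gap a learning-based attack has to close. The sketch only shows that the gradient is a deterministic function of the continuous label.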
CR · Jul 9, 2021
Universal 3-Dimensional Perturbations for Black-Box Attacks on Video Recognition Systems
Shangyu Xie, Han Wang, Yu Kong et al.
Widely deployed deep neural network (DNN) models have been proven to be vulnerable to adversarial perturbations in many applications (e.g., image, audio and text classification). To date, only a few adversarial perturbations have been proposed to mislead the DNN models in video recognition systems, and they simply inject 2D perturbations into video frames. However, such attacks may overly perturb the videos without learning the spatio-temporal features (across temporal frames), which are commonly extracted by DNN models for video recognition. To the best of our knowledge, we propose the first black-box attack framework that generates universal 3-dimensional (U3D) perturbations to subvert a variety of video recognition systems. U3D has many advantages: (1) as a transfer-based attack, U3D can universally attack multiple DNN models for video recognition without access to the target DNN model; (2) the high transferability of U3D makes such a universal black-box attack easy to launch, and it can be further enhanced by integrating queries over the target model when necessary; (3) U3D ensures human-imperceptibility; (4) U3D can bypass the existing state-of-the-art defense schemes; (5) U3D can be efficiently generated with a few pre-learned parameters and then immediately injected to attack real-time DNN-based video recognition systems. We have conducted extensive experiments to evaluate U3D on multiple DNN models and three large-scale video datasets. The experimental results demonstrate its superiority and practicality.
CR · Feb 7, 2021
Privacy-preserving Cloud-based DNN Inference
Shangyu Xie, Bingyu Liu, Yuan Hong
Deep learning as a service (DLaaS) has been intensively studied to facilitate the wider deployment of emerging deep learning applications. However, DLaaS may compromise the privacy of both clients and cloud servers. Although some privacy-preserving deep neural network (DNN) inference techniques have been proposed by composing cryptographic primitives, the challenges of computational efficiency have not been well addressed due to the complexity of DNN models and the expense of cryptographic primitives. In this paper, we propose a novel privacy-preserving cloud-based DNN inference framework (namely, "PROUD"), which greatly improves computational efficiency. Finally, we conduct extensive experiments on two commonly used datasets to validate both the effectiveness and efficiency of PROUD, which also outperforms state-of-the-art techniques.
CR · Sep 20, 2020
R$^2$DP: A Universal and Automated Approach to Optimizing the Randomization Mechanisms of Differential Privacy for Utility Metrics with No Known Optimal Distributions
Meisam Mohammady, Shangyu Xie, Yuan Hong et al.
Differential privacy (DP) has emerged as a de facto standard privacy notion for a wide range of applications. Since the meaning of data utility may differ vastly across applications, a key challenge is to find the optimal randomization mechanism, i.e., the distribution and its parameters, for a given utility metric. Existing works have identified the optimal distributions in some special cases, while leaving all other utility metrics (e.g., usefulness and graph distance) as open problems. Since existing works mostly rely on manual analysis to examine the search space of all distributions, it would be expensive to repeat such efforts for each utility metric. To address this deficiency, we propose a novel approach that can automatically optimize different utility metrics found in diverse applications under a common framework. Our key idea is that, by regarding the variance of the injected noise itself as a random variable, a two-fold distribution may approximately cover the search space of all distributions. Therefore, we can automatically find distributions in this search space to optimize different utility metrics in a similar manner, simply by optimizing the parameters of the two-fold distribution. Specifically, we define a universal framework, namely, randomizing the randomization mechanism of differential privacy (R$^2$DP), and we formally analyze its privacy and utility. Our experiments show that R$^2$DP can provide better results than the baseline distribution (Laplace) for several utility metrics with no known optimal distributions, whereas our results asymptotically approach optimality for utility metrics that have known optimal distributions. As a side benefit, the added degree of freedom introduced by the two-fold distribution allows R$^2$DP to accommodate the preferences of both data owners and recipients.
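The "two-fold distribution" idea, treating the noise scale itself as a random variable, can be sketched as a two-stage sampler: draw a scale, then draw Laplace noise with that scale. The Gamma choice for the scale and every name below are assumptions made for illustration. The sketch shows only the sampling structure, and makes no privacy claim. The calibration and analysis in R$^2$DP are not reproduced here.

```python
import random


def sample_laplace(scale, rng):
    """Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    # expovariate takes the rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def sample_two_fold(scale_sampler, rng):
    """First draw a scale, then draw Laplace noise with that scale."""
    scale = scale_sampler(rng)
    return sample_laplace(scale, rng)


def gamma_scale(shape, theta):
    """Hypothetical scale distribution: Gamma(shape, theta), both > 0."""
    return lambda rng: rng.gammavariate(shape, theta)


def fixed_scale(value):
    """Degenerate scale distribution, which recovers plain Laplace noise."""
    return lambda rng: value


rng = random.Random(0)
mixed = [sample_two_fold(gamma_scale(2.0, 1.0), rng) for _ in range(5)]
plain = [sample_two_fold(fixed_scale(2.0), rng) for _ in range(5)]
print("two-fold samples:", mixed)
print("plain Laplace samples:", plain)
```

Under this framing, tuning the parameters of the scale distribution (here `shape` and `theta`) changes the shape of the resulting noise distribution, which is the degree of freedom the abstract refers to.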
CR · Apr 25, 2020
Privacy Preserving Distributed Energy Trading
Shangyu Xie, Han Wang, Yuan Hong et al.
The smart grid incentivizes distributed agents with local generation (e.g., smart homes and microgrids) to establish multi-agent systems for enhanced reliability and energy consumption efficiency. Distributed energy trading has emerged as one of the most important multi-agent systems on the power grid by enabling agents to sell their excess local energy to each other or back to the grid. However, it requires all the agents to disclose their sensitive data (e.g., each agent's fine-grained local generation and demand load). In this paper, to the best of our knowledge, we propose the first privacy preserving distributed energy trading framework, Private Energy Market (PEM), in which all the agents privately compute an optimal price for their trading (ensured by a Nash Equilibrium) and allocate pairwise energy trading amounts without disclosing sensitive data (via novel cryptographic protocols). Specifically, we model the trading problem as a non-cooperative Stackelberg game for all the agents (i.e., buyers and sellers) to determine the optimal price, and then derive the pairwise trading amounts. Our PEM framework can privately perform all the computations among all the agents without a trusted third party. We prove the privacy, individual rationality, and incentive compatibility of the PEM framework. Finally, we conduct experiments on real datasets to validate the effectiveness and efficiency of PEM.
CR · Sep 18, 2019
VideoDP: A Universal Platform for Video Analytics with Differential Privacy
Han Wang, Shangyu Xie, Yuan Hong
Massive amounts of video data are ubiquitously generated by personal devices and dedicated video recording facilities. Analyzing such data would be extremely beneficial in the real world (e.g., urban traffic analysis, pedestrian behavior analysis, video surveillance). However, videos contain considerable sensitive information, such as human faces, identities and activities. Most of the existing video sanitization techniques simply obfuscate the video by detecting and blurring the regions of interest (e.g., faces, vehicle plates, locations and timestamps) without quantifying and bounding the privacy leakage in the sanitization. In this paper, to the best of our knowledge, we propose the first differentially private video analytics platform (VideoDP), which flexibly supports different video analyses with a rigorous privacy guarantee. Unlike traditional noise-injection-based differentially private mechanisms, VideoDP takes the input video and randomly generates a utility-driven private video in which adding or removing any sensitive visual element (e.g., human, object) does not significantly affect the output video. Then, different video analyses requested by untrusted video analysts can be flexibly performed over the utility-driven video while ensuring differential privacy. Finally, we conduct experiments on real videos, and the experimental results demonstrate that VideoDP effectively supports video analytics with good utility.
CR · Nov 1, 2017
Privacy Preserving and Collusion Resistant Energy Sharing
Yuan Hong, Han Wang, Shangyu Xie et al.
Over the past decade, energy has been increasingly generated or collected by different entities on the power grid (e.g., universities, hospitals and households) via solar panels, wind turbines or local generators. With local energy, such electricity consumers can be considered "microgrids" that simultaneously generate and consume energy. Some microgrids may have excess energy that can be shared with other power consumers on the grid. To this end, all the entities have to share their local private information (e.g., their local demand, local supply and power quality data) with each other or with a third party to find and implement the optimal energy sharing solution. However, such a process is constrained by privacy concerns raised by the microgrids. In this paper, we propose a privacy preserving scheme for all the microgrids that can securely implement their energy sharing against both semi-honest and colluding adversaries. The proposed approach includes two secure communication protocols that can ensure quantified privacy leakage and handle collusions.