55.5CRJun 4Code
Protecting K-Nearest Neighbor Queries from Location Inference AttacksZhiyu Sun, Jie Fu, Xinpeng Ling et al.
The k-nearest neighbor query (kNNQ) is a core component of modern location-based services (LBS) and has been widely adopted in popular features such as ``people nearby''. However, its potential privacy risks have long been overlooked. In this work, we present the first two attacks against kNNQ, namely the geometric intersection location inference attack (GI-LIA) and the zero-order optimization location inference attack (ZO-LIA), revealing the inherent location privacy risks posed by kNNQ. To mitigate these privacy risks, we further propose DPRS, a differential privacy framework for kNNQ protection. The core idea of DPRS is to incorporate a rejection sampling mechanism within a constrained perturbation interval, thereby mitigating the distance distortion caused by excessive noise injection. In addition, we design a private interval construction algorithm to construct the perturbation interval, enabling the rejection sampling mechanism to achieve a more favorable trade-off between privacy protection and query utility in kNNQ. Extensive experiments on real-world spatial datasets demonstrate that DPRS outperforms existing methods in both privacy protection and query utility. Our code is available at https://github.com/reanatom/DPRS.
95.3CRMay 18Code
Reverse-Engineering Model Editing on Language ModelsZhiyu Sun, Minrui Luo, Yu Wang et al.
Large language models (LLMs) are pretrained on corpora containing trillions of tokens and, therefore, inevitably memorize sensitive information. Locate-then-edit methods, as a mainstream paradigm of model editing, offer a promising solution by modifying model parameters without retraining. However, in this work, we reveal a critical vulnerability of this paradigm: the parameter updates inadvertently serve as a side channel, enabling attackers to recover the edited data. We propose a two-stage reverse-engineering attack named \textit{KSTER} (\textbf{K}ey\textbf{S}paceRecons\textbf{T}ruction-then-\textbf{E}ntropy\textbf{R}eduction) that leverages the low-rank structure of these updates. First, we theoretically show that the row space of the update matrix encodes a ``fingerprint" of the edited subjects, enabling accurate subject recovery via spectral analysis. Second, we introduce an entropy-based prompt recovery attack that reconstructs the semantic context of the edit. Extensive experiments on multiple LLMs demonstrate that our attacks can recover edited data with high success rates. Furthermore, we propose \textit{subspace camouflage}, a defense strategy that obfuscates the update fingerprint with semantic decoys. This approach effectively mitigates reconstruction risks without compromising editing utility. Our code is available at https://github.com/reanatom/EditingAttack.
LGAug 21, 2023Code
ALI-DPFL: Differentially Private Federated Learning with Adaptive Local IterationsXinpeng Ling, Jie Fu, Kuncan Wang et al.
Federated Learning (FL) is a distributed machine learning technique that allows model training among multiple devices or organizations by sharing training parameters instead of raw data. However, adversaries can still infer individual information through inference attacks (e.g. differential attacks) on these training parameters. As a result, Differential Privacy (DP) has been widely used in FL to prevent such attacks. We consider differentially private federated learning in a resource-constrained scenario, where both privacy budget and communication rounds are constrained. By theoretically analyzing the convergence, we can find the optimal number of local DPSGD iterations for clients between any two sequential global updates. Based on this, we design an algorithm of Differentially Private Federated Learning with Adaptive Local Iterations (ALI-DPFL). We experiment our algorithm on the MNIST, FashionMNIST and Cifar10 datasets, and demonstrate significantly better performances than previous work in the resource-constraint scenario. Code is available at https://github.com/cheng-t/ALI-DPFL.
CVMay 2, 2022
Point Cloud Compression with Sibling Context and Surface PriorsZhili Chen, Zian Qian, Sukai Wang et al.
We present a novel octree-based multi-level framework for large-scale point cloud compression, which can organize sparse and unstructured point clouds in a memory-efficient way. In this framework, we propose a new entropy model that explores the hierarchical dependency in an octree using the context of siblings' children, ancestors, and neighbors to encode the occupancy information of each non-leaf octree node into a bitstream. Moreover, we locally fit quadratic surfaces with a voxel-based geometry-aware module to provide geometric priors in entropy encoding. These strong priors empower our entropy framework to encode the octree into a more compact bitstream. In the decoding stage, we apply a two-step heuristic strategy to restore point clouds with better reconstruction quality. The quantitative evaluation shows that our method outperforms state-of-the-art baselines with a bitrate improvement of 11-16% and 12-14% on the KITTI Odometry and nuScenes datasets, respectively.
LGNov 29, 2022
Adap DP-FL: Differentially Private Federated Learning with Adaptive NoiseJie Fu, Zhili Chen, Xiao Han
Federated learning seeks to address the issue of isolated data islands by making clients disclose only their local training models. However, it was demonstrated that private information could still be inferred by analyzing local model parameters, such as deep neural network model weights. Recently, differential privacy has been applied to federated learning to protect data privacy, but the noise added may degrade the learning performance much. Typically, in previous work, training parameters were clipped equally and noises were added uniformly. The heterogeneity and convergence of training parameters were simply not considered. In this paper, we propose a differentially private scheme for federated learning with adaptive noise (Adap DP-FL). Specifically, due to the gradient heterogeneity, we conduct adaptive gradient clipping for different clients and different rounds; due to the gradient convergence, we add decreasing noises accordingly. Extensive experiments on real-world datasets demonstrate that our Adap DP-FL outperforms previous methods significantly.
LGNov 23, 2023
DPSUR: Accelerating Differentially Private Stochastic Gradient Descent Using Selective Update and ReleaseJie Fu, Qingqing Ye, Haibo Hu et al.
Machine learning models are known to memorize private data to reduce their training loss, which can be inadvertently exploited by privacy attacks such as model inversion and membership inference. To protect against these attacks, differential privacy (DP) has become the de facto standard for privacy-preserving machine learning, particularly those popular training algorithms using stochastic gradient descent, such as DPSGD. Nonetheless, DPSGD still suffers from severe utility loss due to its slow convergence. This is partially caused by the random sampling, which brings bias and variance to the gradient, and partially by the Gaussian noise, which leads to fluctuation of gradient updates. Our key idea to address these issues is to apply selective updates to the model training, while discarding those useless or even harmful updates. Motivated by this, this paper proposes DPSUR, a Differentially Private training framework based on Selective Updates and Release, where the gradient from each iteration is evaluated based on a validation test, and only those updates leading to convergence are applied to the model. As such, DPSUR ensures the training in the right direction and thus can achieve faster convergence than DPSGD. The main challenges lie in two aspects -- privacy concerns arising from gradient evaluation, and gradient selection strategy for model update. To address the challenges, DPSUR introduces a clipping strategy for update randomization and a threshold mechanism for gradient selection. Experiments conducted on MNIST, FMNIST, CIFAR-10, and IMDB datasets show that DPSUR significantly outperforms previous works in terms of convergence speed and model utility.
CVJul 22, 2024
Learning High-resolution Vector Representation from Multi-Camera Images for 3D Object DetectionZhili Chen, Shuangjie Xu, Maosheng Ye et al.
The Bird's-Eye-View (BEV) representation is a critical factor that directly impacts the 3D object detection performance, but the traditional BEV grid representation induces quadratic computational cost as the spatial resolution grows. To address this limitation, we present a new camera-based 3D object detector with high-resolution vector representation: VectorFormer. The presented high-resolution vector representation is combined with the lower-resolution BEV representation to efficiently exploit 3D geometry from multi-camera images at a high resolution through our two novel modules: vector scattering and gathering. To this end, the learned vector representation with richer scene contexts can serve as the decoding query for final predictions. We conduct extensive experiments on the nuScenes dataset and demonstrate state-of-the-art performance in NDS and inference time. Furthermore, we investigate query-BEV-based methods incorporated with our proposed vector representation and observe a consistent performance improvement.
CVNov 14, 2023
PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous DrivingZhili Chen, Maosheng Ye, Shuangjie Xu et al.
We present a new interaction mechanism of prediction and planning for end-to-end autonomous driving, called PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving), which considers the timestep-wise interaction to better integrate prediction and planning. An ego vehicle performs motion planning at each timestep based on the trajectory prediction of surrounding agents (e.g., vehicles and pedestrians) and its local road conditions. Unlike existing end-to-end autonomous driving frameworks, PPAD models the interactions among ego, agents, and the dynamic environment in an autoregressive manner by interleaving the Prediction and Planning processes at every timestep, instead of a single sequential process of prediction followed by planning. Specifically, we design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions. The experiments on the nuScenes benchmark show that our approach outperforms state-of-the-art methods.
LGAug 20, 2024Code
Single-cell Curriculum Learning-based Deep Graph Embedding ClusteringHuifa Li, Jie Fu, Xinpeng Ling et al.
The swift advancement of single-cell RNA sequencing (scRNA-seq) technologies enables the investigation of cellular-level tissue heterogeneity. Cell annotation significantly contributes to the extensive downstream analysis of scRNA-seq data. However, The analysis of scRNA-seq for biological inference presents challenges owing to its intricate and indeterminate data distribution, characterized by a substantial volume and a high frequency of dropout events. Furthermore, the quality of training samples varies greatly, and the performance of the popular scRNA-seq data clustering solution GNN could be harmed by two types of low-quality training nodes: 1) nodes on the boundary; 2) nodes that contribute little additional information to the graph. To address these problems, we propose a single-cell curriculum learning-based deep graph embedding clustering (scCLG). We first propose a Chebyshev graph convolutional autoencoder with multi-criteria (ChebAE) that combines three optimization objectives, including topology reconstruction loss of cell graphs, zero-inflated negative binomial (ZINB) loss, and clustering loss, to learn cell-cell topology representation. Meanwhile, we employ a selective training strategy to train GNN based on the features and entropy of nodes and prune the difficult nodes based on the difficulty scores to keep the high-quality graph. Empirical results on a variety of gene expression datasets show that our model outperforms state-of-the-art methods. The code of scCLG will be made publicly available at https://github.com/LFD-byte/scCLG.
CRNov 14, 2022
SA-DPSGD: Differentially Private Stochastic Gradient Descent based on Simulated AnnealingJie Fu, Zhili Chen, XinPeng Ling
Differential privacy (DP) provides a formal privacy guarantee that prevents adversaries with access to machine learning models from extracting information about individual training points. Differentially private stochastic gradient descent (DPSGD) is the most popular training method with differential privacy in image recognition. However, existing DPSGD schemes lead to significant performance degradation, which prevents the application of differential privacy. In this paper, we propose a simulated annealing-based differentially private stochastic gradient descent scheme (SA-DPSGD) which accepts a candidate update with a probability that depends both on the update quality and on the number of iterations. Through this random update screening, we make the differentially private gradient descent proceed in the right direction in each iteration, and result in a more accurate model finally. In our experiments, under the same hyperparameters, our scheme achieves test accuracies 98.35%, 87.41% and 60.92% on datasets MNIST, FashionMNIST and CIFAR10, respectively, compared to the state-of-the-art result of 98.12%, 86.33% and 59.34%. Under the freely adjusted hyperparameters, our scheme achieves even higher accuracies, 98.89%, 88.50% and 64.17%. We believe that our method has a great contribution for closing the accuracy gap between private and non-private image classification.
CRAug 4, 2025Code
Coward: Toward Practical Proactive Federated Backdoor Defense via Collision-based WatermarkWenjie Li, Siying Gu, Yiming Li et al.
Backdoor detection is currently the mainstream defense against backdoor attacks in federated learning (FL), where malicious clients upload poisoned updates that compromise the global model and undermine the reliability of FL deployments. Existing backdoor detection techniques fall into two categories, including passive and proactive ones, depending on whether the server proactively modifies the global model. However, both have inherent limitations in practice: passive defenses are vulnerable to common non-i.i.d. data distributions and random participation of FL clients, whereas current proactive defenses suffer inevitable out-of-distribution (OOD) bias because they rely on backdoor co-existence effects. To address these issues, we introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects, in which consecutively planted, distinct backdoors significantly suppress earlier ones. In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained. Our method preserves the advantages of proactive defenses in handling data heterogeneity (\ie, non-i.i.d. data) while mitigating the adverse impact of OOD bias through a revised detection mechanism. Extensive experiments on benchmark datasets confirm the effectiveness of Coward and its resilience to potential adaptive attacks. The code for our method would be available at https://github.com/still2009/cowardFL.
LGNov 6, 2023
DP-DCAN: Differentially Private Deep Contrastive Autoencoder Network for Single-cell ClusteringHuifa Li, Jie Fu, Zhili Chen et al.
Single-cell RNA sequencing (scRNA-seq) is important to transcriptomic analysis of gene expression. Recently, deep learning has facilitated the analysis of high-dimensional single-cell data. Unfortunately, deep learning models may leak sensitive information about users. As a result, Differential Privacy (DP) is increasingly used to protect privacy. However, existing DP methods usually perturb whole neural networks to achieve differential privacy, and hence result in great performance overheads. To address this challenge, in this paper, we take advantage of the uniqueness of the autoencoder that it outputs only the dimension-reduced vector in the middle of the network, and design a Differentially Private Deep Contrastive Autoencoder Network (DP-DCAN) by partial network perturbation for single-cell clustering. Since only partial network is added with noise, the performance improvement is obvious and twofold: one part of network is trained with less noise due to a bigger privacy budget, and the other part is trained without any noise. Experimental results of six datasets have verified that DP-DCAN is superior to the traditional DP scheme with whole network perturbation. Moreover, DP-DCAN demonstrates strong robustness to adversarial attacks.
LGSep 5, 2025Code
Safeguarding Graph Neural Networks against Topology Inference AttacksJie Fu, Yuan Hong, Zhili Chen et al.
Graph Neural Networks (GNNs) have emerged as powerful models for learning from graph-structured data. However, their widespread adoption has raised serious privacy concerns. While prior research has primarily focused on edge-level privacy, a critical yet underexplored threat lies in topology privacy - the confidentiality of the graph's overall structure. In this work, we present a comprehensive study on topology privacy risks in GNNs, revealing their vulnerability to graph-level inference attacks. To this end, we propose a suite of Topology Inference Attacks (TIAs) that can reconstruct the structure of a target training graph using only black-box access to a GNN model. Our findings show that GNNs are highly susceptible to these attacks, and that existing edge-level differential privacy mechanisms are insufficient as they either fail to mitigate the risk or severely compromise model accuracy. To address this challenge, we introduce Private Graph Reconstruction (PGR), a novel defense framework designed to protect topology privacy while maintaining model accuracy. PGR is formulated as a bi-level optimization problem, where a synthetic training graph is iteratively generated using meta-gradients, and the GNN model is concurrently updated based on the evolving graph. Extensive experiments demonstrate that PGR significantly reduces topology leakage with minimal impact on model accuracy. Our code is available at https://github.com/JeffffffFu/PGR.
LGAug 7, 2025Code
scAGC: Learning Adaptive Cell Graphs with Contrastive Guidance for Single-Cell ClusteringHuifa Li, Jie Fu, Xinlin Zhuang et al.
Accurate cell type annotation is a crucial step in analyzing single-cell RNA sequencing (scRNA-seq) data, which provides valuable insights into cellular heterogeneity. However, due to the high dimensionality and prevalence of zero elements in scRNA-seq data, traditional clustering methods face significant statistical and computational challenges. While some advanced methods use graph neural networks to model cell-cell relationships, they often depend on static graph structures that are sensitive to noise and fail to capture the long-tailed distribution inherent in single-cell populations.To address these limitations, we propose scAGC, a single-cell clustering method that learns adaptive cell graphs with contrastive guidance. Our approach optimizes feature representations and cell graphs simultaneously in an end-to-end manner. Specifically, we introduce a topology-adaptive graph autoencoder that leverages a differentiable Gumbel-Softmax sampling strategy to dynamically refine the graph structure during training. This adaptive mechanism mitigates the problem of a long-tailed degree distribution by promoting a more balanced neighborhood structure. To model the discrete, over-dispersed, and zero-inflated nature of scRNA-seq data, we integrate a Zero-Inflated Negative Binomial (ZINB) loss for robust feature reconstruction. Furthermore, a contrastive learning objective is incorporated to regularize the graph learning process and prevent abrupt changes in the graph topology, ensuring stability and enhancing convergence. Comprehensive experiments on 9 real scRNA-seq datasets demonstrate that scAGC consistently outperforms other state-of-the-art methods, yielding the best NMI and ARI scores on 9 and 7 datasets, respectively.Our code is available at Anonymous Github.
LGMay 21, 2025Code
EC-LDA : Label Distribution Inference Attack against Federated Graph Learning with Embedding CompressionTong Cheng, Jie Fu, Xinpeng Ling et al.
Graph Neural Networks (GNNs) have been widely used for graph analysis. Federated Graph Learning (FGL) is an emerging learning framework to collaboratively train graph data from various clients. Although FGL allows client data to remain localized, a malicious server can still steal client private data information through uploaded gradient. In this paper, we for the first time propose label distribution attacks (LDAs) on FGL that aim to infer the label distributions of the client-side data. Firstly, we observe that the effectiveness of LDA is closely related to the variance of node embeddings in GNNs. Next, we analyze the relation between them and propose a new attack named EC-LDA, which significantly improves the attack effectiveness by compressing node embeddings. Then, extensive experiments on node classification and link prediction tasks across six widely used graph datasets show that EC-LDA outperforms the SOTA LDAs. Specifically, EC-LDA can achieve the Cos-sim as high as 1.0 under almost all cases. Finally, we explore the robustness of EC-LDA under differential privacy protection and discuss the potential effective defense methods to EC-LDA. Our code is available at https://github.com/cheng-t/EC-LDA.
CVMay 17, 2019Code
Learning to Reconstruct 3D Manhattan Wireframes from a Single ImageYichao Zhou, Haozhi Qi, Yuexiang Zhai et al.
In this paper, we propose a method to obtain a compact and accurate 3D wireframe representation from a single image by effectively exploiting global structural regularities. Our method trains a convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with the state-of-the-art learning-based wireframe detection methods, our network is simpler and more unified, leading to better 2D wireframe detection. With global structural priors from parallelism, our method further reconstructs a full 3D wireframe model, a compact vector representation suitable for a variety of high-level vision tasks such as AR and CAD. We conduct extensive evaluations on a large synthetic dataset of urban scenes as well as real images. Our code and datasets have been made public at https://github.com/zhou13/shapeunity.
CRMay 14, 2024
Differentially Private Federated Learning: A Systematic ReviewJie Fu, Yuan Hong, Xinpeng Ling et al.
In recent years, privacy and security concerns in machine learning have promoted trusted federated learning to the forefront of research. Differential privacy has emerged as the de facto standard for privacy protection in federated learning due to its rigorous mathematical foundation and provable guarantee. Despite extensive research on algorithms that incorporate differential privacy within federated learning, there remains an evident deficiency in systematic reviews that categorize and synthesize these studies. Our work presents a systematic overview of the differentially private federated learning. Existing taxonomies have not adequately considered objects and level of privacy protection provided by various differential privacy models in federated learning. To rectify this gap, we propose a new taxonomy of differentially private federated learning based on definition and guarantee of various differential privacy models and federated scenarios. Our classification allows for a clear delineation of the protected objects across various differential privacy models and their respective neighborhood levels within federated learning environments. Furthermore, we explore the applications of differential privacy in federated learning scenarios. Our work provide valuable insights into privacy-preserving federated learning and suggest practical directions for future research.
CVNov 20, 2024
Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous DrivingHao Zhou, Zhanning Gao, Zhili Chen et al.
In light of the dynamic nature of autonomous driving environments and stringent safety requirements, general MLLMs combined with CLIP alone often struggle to accurately represent driving-specific scenarios, particularly in complex interactions and long-tail cases. To address this, we propose the Hints of Prompt (HoP) framework, which introduces three key enhancements: Affinity hint to emphasize instance-level structure by strengthening token-wise connections, Semantic hint to incorporate high-level information relevant to driving-specific cases, such as complex interactions among vehicles and traffic signs, and Question hint to align visual features with the query context, focusing on question-relevant regions. These hints are fused through a Hint Fusion module, enriching visual representations by capturing driving-related representations with limited domain data, ensuring faster adaptation to driving scenarios. Extensive experiments confirm the effectiveness of the HoP framework, showing that it significantly outperforms previous state-of-the-art methods in all key metrics.
LGDec 21, 2024
CBNN: 3-Party Secure Framework for Customized Binary Neural Networks InferenceBenchang Dong, Zhili Chen, Xin Chen et al.
Binarized Neural Networks (BNN) offer efficient implementations for machine learning tasks and facilitate Privacy-Preserving Machine Learning (PPML) by simplifying operations with binary values. Nevertheless, challenges persist in terms of communication and accuracy in their application scenarios. In this work, we introduce CBNN, a three-party secure computation framework tailored for efficient BNN inference. Leveraging knowledge distillation and separable convolutions, CBNN transforms standard BNNs into MPC-friendly customized BNNs, maintaining high utility. It performs secure inference using optimized protocols for basic operations. Specifically, CBNN enhances linear operations with replicated secret sharing and MPC-friendly convolutions, while introducing a novel secure activation function to optimize non-linear operations. We demonstrate the effectiveness of CBNN by transforming and securely implementing several typical BNN models. Experimental results indicate that CBNN maintains impressive performance even after customized binarization and security measures
CVMar 10, 2024
Cross-Cluster Shifting for Efficient and Effective 3D Object Detection in Autonomous DrivingZhili Chen, Kien T. Pham, Maosheng Ye et al.
We present a new 3D point-based detector model, named Shift-SSD, for precise 3D object detection in autonomous driving. Traditional point-based 3D object detectors often employ architectures that rely on a progressive downsampling of points. While this method effectively reduces computational demands and increases receptive fields, it will compromise the preservation of crucial non-local information for accurate 3D object detection, especially in the complex driving scenarios. To address this, we introduce an intriguing Cross-Cluster Shifting operation to unleash the representation capacity of the point-based detector by efficiently modeling longer-range inter-dependency while including only a negligible overhead. Concretely, the Cross-Cluster Shifting operation enhances the conventional design by shifting partial channels from neighboring clusters, which enables richer interaction with non-local regions and thus enlarges the receptive field of clusters. We conduct extensive experiments on the KITTI, Waymo, and nuScenes datasets, and the results demonstrate the state-of-the-art performance of Shift-SSD in both detection accuracy and runtime efficiency.
60.5CRMar 13
Bipartite Randomized Response Mechanism for Local Differential PrivacyShun Zhang, Hai Zhu, Zhili Chen et al.
With the increasing importance of data privacy, Local Differential Privacy (LDP) has recently become a strong measure of privacy for protecting each user's privacy from data analysts without relying on a trusted third party. In this paper, we consider the problem of high-utility differentially private release. Given a domain of items and a distance-defined utility function, our goal is to design a differentially private mechanism that releases an item with the global expected error as small as possible. The most common LDP mechanism for this task is the Generalized Randomized Response (GRR) mechanism that treats all candidate items equally except for the true item. In this paper, we introduce Bipartite Randomized Response mechanism (BRR), which adaptively divides all candidate items into two parts by utility rankings. In the local search phase, we confirm how many high-utility candidates to be assigned with high release probability, which gives the locally optimal bipartite classification of all candidates. For preserving LDP, the global search phase uniformly selects the smallest number of dynamic high-utility candidates obtained locally. In particular, we give explicit formulas on the uniform number of dynamic high-utility candidates. The global expected error of our BRR can theoretically deliver a decrease with an asymptotically exact ratio, and when the privacy budget is set to $3$ the expected error can be reduced by $66.4\%$. Extensive experiments demonstrate that BRR outperforms the state-of-the-art methods across the standard metrics and datasets.
CRJan 27, 2022
Geo-MOEA: A Multi-Objective Evolutionary Algorithm with Geo-obfuscation for Mobile Crowdsourcing WorkersShun Zhang, Tao Zhang, Zhili Chen et al.
The rapid development of mobile Internet and sharing economy brings the prosperity of Spatial Crowdsourcing (SC). SC applications assign various tasks according to reported location information of task's requesters and outsourced workers (such as DiDi, MeiTuan and Uber). However, SC-servers are often untrustworthy and the exposure of users' locations raises privacy concerns. In this paper, we design a framework called Geo-MOEA (Multi-Objective Evolutionary Algorithm with Geo-obfuscation) to protect location privacy of workers involved on SC platform in mobile networks environment. We propose an adaptive regionalized obfuscation approach with inference error bounds based on geo-indistinguishability (a strong notion of differential privacy), which is suitable for the context of large-scale location data and task allocations. This enables each worker to report a pseudo-location that is adaptively generated with a personalized inference error threshold. Moreover, as a popular computational intelligence method, MOEA is introduced to optimize the trade-off between SC service availability and privacy protection while ensuring theoretically the most general condition on protection location sets for larger search space. Finally, the experimental results on two public datasets show that our Geo-MOEA approach achieves up to 20% reduction in service quality loss while guaranteeing differential and geo-distortion location privacy.
CRFeb 1, 2021
DPIVE: A Regionalized Location Obfuscation Scheme with Personalized Privacy LevelsShun Zhang, Pengfei Lan, Benfei Duan et al.
The popularity of cyber-physical systems is fueling the rapid growth of location-based services. This poses the risk of location privacy disclosure. Effective privacy preservation is foremost for various mobile applications. Recently, geo-indistinguishability and expected inference error are proposed for limiting location leakages. In this paper, we argue that personalization means regionalization for geo-indistinguishability, and we propose a regionalized location obfuscation mechanism called DPIVE with personalized utility sensitivities. This substantially corrects the differential and distortion privacy problem of PIVE framework proposed by Yu et al. on NDSS 2017. We develop DPIVE with two phases. In Phase I, we determine disjoint sets by partitioning all possible positions such that different locations in the same set share the Protection Location Set (PLS). In Phase II, we construct a probability distribution matrix in which the rows corresponding to the same PLS have their own sensitivity of utility (PLS diameter). Moreover, by designing QK-means algorithm for more search space in 2-D space, we improve DPIVE with refined location partition and present fine-grained personalization, enabling each location to have its own privacy level endowed with a customized privacy budget. Experiments with two public datasets demonstrate that our mechanisms have the superior performance, typically on skewed locations.
CROct 3, 2020
DCDChain: A Credible Architecture of Digital Copyright Detection Based on BlockchainZhili Chen, Yuting Wang, Tianjiao Ni
Copyright detection is an effective method to prevent piracy. However, untrustworthy detection parties may lead to falsified detection results. Due to its credibility and tamper resistance, blockchain has been applied to copyright protection. Previous works mainly utilized blockchain for reliable copyright information storage or copyrighted digital media trading. As far as we know, the problem of credible copyright detection has not been addressed. In this paper, we propose a credible copyright detection architecture based on the blockchain, called DCDChain. In this architecture, the detection agency first detects copyrights off the chain, then uploads the detection records to the blockchain. Since data on the blockchain are publicly accessible, media providers can verify the correctness of the copyright detection, and appeal to a smart contract if there is any dissent. The smart contract then arbitrates the disputes by verifying the correctness of detection on the chain. The detect-verify-and-arbitrate mechanism guarantees the credibility of copyright detection. Security analysis and experimental simulations show that the digital copyright detection architecture is credible, secure and efficient. The proposed credible copyright detection scheme is highly important for copyright protection. The future work is to improve the scheme by designing more effective locality sensitive hash algorithms for various digital media.
CROct 3, 2020
Utility-efficient Differentially Private K-means Clustering based on Cluster MergingTianjiao Ni, Minghao Qiao, Zhili Chen et al.
Differential privacy is widely used in data analysis. State-of-the-art $k$-means clustering algorithms with differential privacy typically add an equal amount of noise to centroids for each iterative computation. In this paper, we propose a novel differentially private $k$-means clustering algorithm, DP-KCCM, that significantly improves the utility of clustering by adding adaptive noise and merging clusters. Specifically, to obtain $k$ clusters with differential privacy, the algorithm first generates $n \times k$ initial centroids, adds adaptive noise for each iteration to get $n \times k$ clusters, and finally merges these clusters into $k$ ones. We theoretically prove the differential privacy of the proposed algorithm. Surprisingly, extensive experimental results show that: 1) cluster merging with equal amounts of noise improves the utility somewhat; 2) although adding adaptive noise only does not improve the utility, combining both cluster merging and adaptive noise further improves the utility significantly.
CRAug 8, 2020
A Differentially Private Framework for Spatial Crowdsourcing with Historical Data LearningShun Zhang, Benfei Duan, Zhili Chen et al.
Spatial crowdsourcing (SC) is an increasing popular category of crowdsourcing in the era of mobile Internet and sharing economy. It requires workers to arrive at a particular location for task fulfillment. Effective protection of location privacy is essential for workers' enthusiasm and valid task assignment. However, existing SC models with differential privacy usually perturb real-time location data for both partition and data publication. Such a way may produce large perturbations to counting queries that affect assignment success rate and allocation accuracy. This paper proposes a framework (R-HT) for protecting location privacy of workers taking advantage of both real-time and historical data. We simulate locations by sampling the probability distribution learned from historical data, use them for grid partition, and then publish real-time data under this partitioning with differential privacy. This realizes that most privacy budget is allocated to the worker count of each cell and yields an improved Private Spatial Decomposition approach. Moreover, we introduce some strategies for geocast region construction, including quality scoring function and local maximum geocast radius. A series of experimental results on real-world datasets shows that R-HT attains a stable success rate of task assignment, saves performance overhead and fits for dynamic assignment on crowdsourcing platforms.
CVAug 7, 2020
HoliCity: A City-Scale Data Platform for Learning Holistic 3D StructuresYichao Zhou, Jingwei Huang, Xili Dai et al.
We present HoliCity, a city-scale 3D dataset with rich structural information. Currently, this dataset has 6,300 real-world panoramas of resolution $13312 \times 6656$ that are accurately aligned with the CAD model of downtown London with an area of more than 20 km$^2$, in which the median reprojection error of the alignment of an average image is less than half a degree. This dataset aims to be an all-in-one data platform for research of learning abstracted high-level holistic 3D structures that can be derived from city CAD models, e.g., corners, lines, wireframes, planes, and cuboids, with the ultimate goal of supporting real-world applications including city-scale reconstruction, localization, mapping, and augmented reality. The accurate alignment of the 3D CAD models and panoramas also benefits low-level 3D vision tasks such as surface normal estimation, as the surface normal extracted from previous LiDAR-based datasets is often noisy. We conduct experiments to demonstrate the applications of HoliCity, such as predicting surface segmentation, normal maps, depth maps, and vanishing points, as well as test the generalizability of methods trained on HoliCity and other related datasets. HoliCity is available at https://holicity.io.
CRJul 29, 2020
Secure Computation Framework for Multiple Data Providers Against Malicious AdversariesZhili Chen, Xin Chen
Due to the great development of secure multi-party computation, many practical secure computation schemes have been proposed. As an example, different secure auction mechanisms have been widely studied, which can protect bid privacy while satisfying various economic properties. However, as far as we know, none of them solve the secure computation problems for multiple data providers (e.g., secure cloud resource auctions) in the malicious security model. In this paper, we use the techniques of cut-and-choose and garbled circuits to propose a general secure computation framework for multiple data providers against malicious adversaries. Specifically, our framework checks input consistency with the cut-and-choose paradigm, conducts maliciously secure computations by running two independent garbled circuits, and verifies the correctness of output by comparing two versions of outputs. Theoretical analysis shows that our framework is secure against a malicious computation party, or a subset of malicious data providers. Taking secure cloud resource auctions as an example, we implement our framework. Extensive experimental evaluations show that the performance of the proposed framework is acceptable in practice.
CVApr 15, 2020
Intuitive, Interactive Beard and Hair Synthesis with Generative ModelsKyle Olszewski, Duygu Ceylan, Jun Xing et al.
We present an interactive approach to synthesizing realistic variations in facial hair in images, ranging from subtle edits to existing hair to the addition of complex and challenging hair in images of clean-shaven subjects. To circumvent the tedious and computationally expensive tasks of modeling, rendering and compositing the 3D geometry of the target hairstyle using the traditional graphics pipeline, we employ a neural network pipeline that synthesizes realistic and detailed images of facial hair directly in the target image in under one second. The synthesis is controlled by simple and sparse guide strokes from the user defining the general structural and color properties of the target hairstyle. We qualitatively and quantitatively evaluate our chosen method compared to several alternative approaches. We show compelling interactive editing results with a prototype user interface that allows novice users to progressively refine the generated image to match their desired hairstyle, and demonstrate that our approach also allows for flexible and high-fidelity scalp hair synthesis.
CVFeb 24, 2020
Anatomy-aware 3D Human Pose Estimation with Bone-based Pose DecompositionTianlang Chen, Chen Fang, Xiaohui Shen et al.
In this work, we propose a new solution to 3D human pose estimation in videos. Instead of directly regressing the 3D joint locations, we draw inspiration from the human skeleton anatomy and decompose the task into bone direction prediction and bone length prediction, from which the 3D joint locations can be completely derived. Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time. This promotes us to develop effective techniques to utilize global information across all the frames in a video for high-accuracy bone length prediction. Moreover, for the bone direction prediction network, we propose a fully-convolutional propagating architecture with long skip connections. Essentially, it predicts the directions of different bones hierarchically without using any time-consuming memory units e.g. LSTM). A novel joint shift loss is further introduced to bridge the training of the bone length and bone direction prediction networks. Finally, we employ an implicit attention mechanism to feed the 2D keypoint visibility scores into the model as extra guidance, which significantly mitigates the depth ambiguity in many challenging poses. Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets, where comprehensive evaluation validates the effectiveness of our model.
CRJan 3, 2020
Fair Auction and Trade Framework for Cloud VM Allocation based on BlockchainZhili Chen, Wei Ding, Yan Xu et al.
Cloud auctions provide cost-effective strategies for cloud VM allocation. Most existing cloud auctions simply assume that the auctioneer is trustable, and thus the fairness of auctions can be easily achieved. However, in fact, such a trustable auctioneer may not exist, and the fairness is non-trivial to guarantee. In this work, for the first time, we propose a decentralized cloud VM auction and trade framework based on blockchain. We realize both auction fairness and trade fairness among participants (e.g., cloud provider and cloud users) in this system, which guarantees the interest of each party will not suffer any loss as long as it follows the protocol. Furthermore, we implement our system through the local blockchain and Ethereum official test blockchain, carry out experimental simulations, and demonstrate the feasibility of our system.
CRJan 3, 2020
Differentially Private Combinatorial Cloud AuctionTianjiao Ni, Zhili Chen, Lin Chen et al.
Cloud service providers typically provide different types of virtual machines (VMs) to cloud users with various requirements. Thanks to its effectiveness and fairness, auction has been widely applied in this heterogeneous resource allocation. Recently, several strategy-proof combinatorial cloud auction mechanisms have been proposed. However, they fail to protect the bid privacy of users from being inferred from the auction results. In this paper, we design a differentially private combinatorial cloud auction mechanism (DPCA) to address this privacy issue. Technically, we employ the exponential mechanism to compute a clearing unit price vector with a probability proportional to the corresponding revenue. We further improve the mechanism to reduce the running time while maintaining high revenues, by computing a single clearing unit price, or a subgroup of clearing unit prices at a time, resulting in the improved mechanisms DPCA-S and its generalized version DPCA-M, respectively. We theoretically prove that our mechanisms can guarantee differential privacy, approximate truthfulness and high revenue. Extensive experimental results demonstrate that DPCA can generate near-optimal revenues at the price of relatively high time complexity, while the improved mechanisms achieve a tunable trade-off between auction revenue and running time.
CRSep 17, 2019
Privacy-preserving Double Auction Mechanism Based on Homomorphic Encryption and Sorting NetworksYin Xu, Zhili Chen, Hong Zhong
As an effective resource allocation approach, double auctions (DAs) have been extensively studied in electronic commerce. Most previous studies have focused on how to design strategy-proof DA mechanisms, while not much research effort has been done concerning privacy and security issues. However, security, especially privacy issues have become such a public concern that the European governments lay down the law to enforce the privacy guarantees recently. In this paper, to address the privacy issue in electronic auctions, we concentrate on how to design a privacy-preserving mechanism for double auctions by employing Goldwasser-Micali homomorphic encryption and sorting networks. We achieve provable privacy such that the auctions do not reveal any bid information except the auction results, resulting in a strict privacy guarantee. Moreover, to achieve practical system performance, we compare different sorting algorithms, and suggest using the faster ones. Experimental results show that different sorting algorithms may have great effect on the performance of our mechanism, and demonstrate the practicality of our protocol for real-world applications in electronic commerce.
CRAug 10, 2019
Efficient Three-party Computation: An Information-theoretic Approach from Cut-and-ChooseZhili Chen
As far as we know, the literature on secure computation from cut-and-choose has focused on achieving computational security against malicious adversaries. It is unclear whether the idea of cut-and-choose can be adapted to secure computation with information-theoretic security. In this work we explore the possibility of using cut-and-choose in information theoretic setting for secure three-party computation (3PC). Previous work on 3PC has mainly focus on the semi-honest case, and is motivated by the observation that real-word deployments of multi-party computation (MPC) seem to involve few parties. We propose a new protocol for information-theoretically secure 3PC tolerating one malicious party with cheating probability $2^{-s}$ using $s$ runs of circuit computation in the cut-and-choose paradigm. The computational cost of our protocol is essentially only a small constant worse than that of state-of-the-art 3PC protocols against a semi-honest corruption, while its communication round is greatly reduced compared to other maliciously secure 3PC protocols in information-theoretic setting.
CRAug 10, 2019
Differentially Private Aggregated Mobility Data Publication Using Moving CharacteristicsZhili Chen, Xiaoli Kan, Shun Zhang et al.
With the rapid development of GPS enabled devices (smartphones) and location-based applications, location privacy is increasingly concerned. Intuitively, it is widely believed that location privacy can be preserved by publishing aggregated mobility data, such as the number of users in an area at some time. However, a recent attack shows that these aggregated mobility data can be exploited to recover individual trajectories. In this paper, we first propose two differentially private basic schemes for aggregated mobility data publication, namely direct perturbation and threshold perturbation, which preserve location privacy of users and especially resist the trajectory recovery attack. Then, we explore the moving characteristics of mobile users, and design an improved scheme named static hybrid perturbation by combining the two basic schemes according to the moving characteristics. Since static hybrid perturbation works only for static data, which are entirely available before publishing, we further adapt the static hybrid perturbation by combining it with linear regression, and yield another improved scheme named dynamic hybrid perturbation. The dynamic hybrid perturbation works also for dynamic data, which are generated on the fly during publication. Privacy analysis shows that the proposed schemes achieve differential privacy. Extensive experiments on both simulated and real datasets demonstrate that all proposed schemes resist the trajectory recovery attack well, and the improved schemes significantly outperform the basic schemes.
CRDec 5, 2018
Differentially Private User-based Collaborative Filtering Recommendation Based on K-means ClusteringZhili Chen, Yu Wang, Shun Zhang et al.
Collaborative filtering (CF) recommendation algorithms are well-known for their outstanding recommendation performances, but previous researches showed that they could cause privacy leakage for users due to k-nearest neighboring (KNN) attacks. Recently, the notion of differential privacy (DP) has been applied to privacy preservation for collaborative filtering recommendation algorithms. However, as far as we know, existing differentially private CF recommendation schemes degraded the recommendation performance (such as recall and precision) to an unacceptable level. In this paper, in order to address the performance degradation problem, we propose a differentially private user-based collaborative filtering recommendation scheme based on k-means clustering (KDPCF). Specifically, to improve the recommendation performance, we first cluster the dataset into categories by k-means clustering and appropriately adjust the size of the target category to which the target user belongs, so that only users in the well-sized target category can be used for recommendations. Then we efficiently select a set of neighbors from the target category at one time by employing only one exponential mechanism instead of the composition of multiple ones, and base on the neighbor set to recommend. We theoretically prove that our scheme achieves differential privacy. Empirically, we use the MovieLens dataset to evaluate our recommendation system. The experimental results demonstrate a significant performance gain compared to existing schemes.
CROct 19, 2018
Probabilistic Matrix Factorization with Personalized Differential PrivacyShun Zhang, Laixiang Liu, Zhili Chen et al.
Probabilistic matrix factorization (PMF) plays a crucial role in recommendation systems. It requires a large amount of user data (such as user shopping records and movie ratings) to predict personal preferences, and thereby provides users high-quality recommendation services, which expose the risk of leakage of user privacy. Differential privacy, as a provable privacy protection framework, has been applied widely to recommendation systems. It is common that different individuals have different levels of privacy requirements on items. However, traditional differential privacy can only provide a uniform level of privacy protection for all users. In this paper, we mainly propose a probabilistic matrix factorization recommendation scheme with personalized differential privacy (PDP-PMF). It aims to meet users' privacy requirements specified at the item-level instead of giving the same level of privacy guarantees for all. We then develop a modified sampling mechanism (with bounded differential privacy) for achieving PDP. We also perform a theoretical analysis of the PDP-PMF scheme and demonstrate the privacy of the PDP-PMF scheme. In addition, we implement the probabilistic matrix factorization schemes both with traditional and with personalized differential privacy (DP-PMF, PDP-PMF) and compare them through a series of experiments. The results show that the PDP-PMF scheme performs well on protecting the privacy of each user and its recommendation quality is much better than the DP-PMF scheme.
CROct 19, 2018
PP-MCSA: Privacy Preserving Multi-Channel Double Spectrum AuctionZhili Chen, Sheng Chen, Hong Zhong et al.
Auction is widely regarded as an effective way in dynamic spectrum redistribution. Recently, considerable research efforts have been devoted to designing privacy-preserving spectrum auctions in a variety of auction settings. However, none of existing work has addressed the privacy issue in the most generic scenario, double spectrum auctions where each seller sells multiple channels and each buyer buys multiple channels. To fill this gap, in this paper we propose PP-MCSA, a Privacy Preserving mechanism for Multi-Channel double Spectrum Auctions. Technically, by leveraging garbled circuits, we manage to protect the privacy of both sellers' requests and buyers' bids in multi-channel double spectrum auctions. As far as we know, PP-MCSA is the first privacy-preserving solution for multi-channel double spectrum auctions. We further theoretically demonstrate the privacy guarantee of PP-MCSA, and extensively evaluate its performance via experiments. Experimental results show that PP-MCSA incurs only moderate communication and computation overhead.
CROct 18, 2018
Making Double Spectrum Auction Practical: Both Privacy and Efficiency MatterZhili Chen, Xuemei Wei, Hong Zhong et al.
Truthful spectrum auction is believed to be an effective method for spectrum redistribution. However, privacy concerns have largely hampered the practical applications of truthful spectrum auctions. In this paper, to make the applications of double spectrum auctions practical, we present a privacy-preserving and socially efficient double spectrum auction design, SDSA. Specifically, by combining three security techniques: homomorphic encryption, secret sharing and garbled circuits, we design a secure two-party protocol computing a socially efficient double spectrum auction, TDSA, without leaking any information about sellers' requests or buyers' bids beyond the auction outcome. We give the formal security definition in our context, and theoretically prove the security that our design achieves. Experimental results show that our design is also efficient in performance, even for large-scale double spectrum auctions.
CROct 18, 2018
Differentially Private Double Spectrum Auction with Approximate Social Welfare MaximizationZhili Chen, Tianjiao Ni, Hong Zhong et al.
Spectrum auction is an effective approach to improving spectrum utilization, by leasing idle spectrum from primary users to secondary users. Recently, a few differentially private spectrum auction mechanisms have been proposed, but, as far as we know, none of them addressed the differential privacy in the setting of double spectrum auctions. In this paper, we combine the concept of differential privacy with double spectrum auction design, and present a Differentially private Double spectrum auction mechanism with approximate Social welfare Maximization (DDSM). Specifically, we design the mechanism by employing the exponential mechanism to select clearing prices for the double spectrum auction with probabilities exponentially proportional to the related social welfare values, and then improve the mechanism in several aspects like the designs of the auction algorithm, the utility function and the buyer grouping algorithm. Through theoretical analysis, we prove that DDSM achieves differential privacy, approximate truthfulness, approximate social welfare maximization. Extensive experimental evaluations show that DDSM achieves a good performance in term of social welfare.
CVOct 14, 2018
Learning to Sketch with Deep Q Networks and Demonstrated StrokesTao Zhou, Chen Fang, Zhaowen Wang et al.
Doodling is a useful and common intelligent skill that people can learn and master. In this work, we propose a two-stage learning framework to teach a machine to doodle in a simulated painting environment via Stroke Demonstration and deep Q-learning (SDQ). The developed system, Doodle-SDQ, generates a sequence of pen actions to reproduce a reference drawing and mimics the behavior of human painters. In the first stage, it learns to draw simple strokes by imitating in supervised fashion from a set of strokeaction pairs collected from artist paintings. In the second stage, it is challenged to draw real and more complex doodles without ground truth actions; thus, it is trained with Qlearning. Our experiments confirm that (1) doodling can be learned without direct stepby- step action supervision and (2) pretraining with stroke demonstration via supervised learning is important to improve performance. We further show that Doodle-SDQ is effective at producing plausible drawings in different media types, including sketch and watercolor.
CRJul 29, 2013
PS-TRUST: Provably Secure Solution for Truthful Double Spectrum AuctionsZhili Chen, Liusheng Huang, Lu Li et al.
Truthful spectrum auctions have been extensively studied in recent years. Truthfulness makes bidders bid their true valuations, simplifying greatly the analysis of auctions. However, revealing one's true valuation causes severe privacy disclosure to the auctioneer and other bidders. To make things worse, previous work on secure spectrum auctions does not provide adequate security. In this paper, based on TRUST, we propose PS-TRUST, a provably secure solution for truthful double spectrum auctions. Besides maintaining the properties of truthfulness and special spectrum reuse of TRUST, PS-TRUST achieves provable security against semi-honest adversaries in the sense of cryptography. Specifically, PS-TRUST reveals nothing about the bids to anyone in the auction, except the auction result. To the best of our knowledge, PS-TRUST is the first provably secure solution for spectrum auctions. Furthermore, experimental results show that the computation and communication overhead of PS-TRUST is modest, and its practical applications are feasible.
CLAug 15, 2012
More than Word Frequencies: Authorship Attribution via Natural Frequency Zoned Word Distribution AnalysisZhili Chen, Liusheng Huang, Wei Yang et al.
With such increasing popularity and availability of digital text data, authorships of digital texts can not be taken for granted due to the ease of copying and parsing. This paper presents a new text style analysis called natural frequency zoned word distribution analysis (NFZ-WDA), and then a basic authorship attribution scheme and an open authorship attribution scheme for digital texts based on the analysis. NFZ-WDA is based on the observation that all authors leave distinct intrinsic word usage traces on texts written by them and these intrinsic styles can be identified and employed to analyze the authorship. The intrinsic word usage styles can be estimated through the analysis of word distribution within a text, which is more than normal word frequency analysis and can be expressed as: which groups of words are used in the text; how frequently does each group of words occur; how are the occurrences of each group of words distributed in the text. Next, the basic authorship attribution scheme and the open authorship attribution scheme provide solutions for both closed and open authorship attribution problems. Through analysis and extensive experimental studies, this paper demonstrates the efficiency of the proposed method for authorship attribution.