LGSep 6, 2024
Exploiting Missing Data Remediation Strategies using Adversarial Missingness AttacksDeniz Koyuncu, Alex Gittens, Bülent Yener et al.
Adversarial Missingness (AM) attacks aim to manipulate model fitting by carefully engineering a missing data problem to achieve a specific malicious objective. AM attacks are significantly different from prior data poisoning attacks in that no malicious data inserted and no data is maliciously perturbed. Current AM attacks are feasible only under the assumption that the modeler (victim) uses full-information maximum likelihood methods to handle missingness. This work aims to remedy this limitation of AM attacks; in the approach taken here, the adversary achieves their goal by solving a bi-level optimization problem to engineer the adversarial missingness mechanism, where the lower level problem incorporates a differentiable approximation of the targeted missingness remediation technique. As instantiations of this framework, AM attacks are provided for three popular techniques: (i) complete case analysis, (ii) mean imputation, and (iii) regression-based imputation for general empirical risk minimization (ERM) problems. Experiments on real-world data show that AM attacks are successful with modest levels of missingness (less than 20%). Furthermore, we show on the real-world Twins dataset that AM attacks can manipulate the estimated average treatment effect (ATE) as an instance of the general ERM problems: the adversary succeeds in not only reversing the sign, but also in substantially inflating the ATE values from a true value of -1.61% to a manipulated one as high as 10%. These experimental results hold when the ATE is calculated using multiple regression-based estimators with different architectures, even when the adversary is restricted to modifying only a subset of the training data.
LGMay 31, 2023
Deception by Omission: Using Adversarial Missingness to Poison Causal Structure LearningDeniz Koyuncu, Alex Gittens, Bülent Yener et al.
Inference of causal structures from observational data is a key component of causal machine learning; in practice, this data may be incompletely observed. Prior work has demonstrated that adversarial perturbations of completely observed training data may be used to force the learning of inaccurate causal structural models (SCMs). However, when the data can be audited for correctness (e.g., it is crytographically signed by its source), this adversarial mechanism is invalidated. This work introduces a novel attack methodology wherein the adversary deceptively omits a portion of the true training data to bias the learned causal structures in a desired manner. Theoretically sound attack mechanisms are derived for the case of arbitrary SCMs, and a sample-efficient learning-based heuristic is given for Gaussian SCMs. Experimental validation of these approaches on real and synthetic data sets demonstrates the effectiveness of adversarial missingness attacks at deceiving popular causal structure learning algorithms.
CRNov 17, 2021
Privacy Guarantees of BLE Contact Tracing: A Case Study on COVIDWISESalman Ahmed, Ya Xiao, Taejoong et al.
Google and Apple jointly introduced a digital contact tracing technology and an API called "exposure notification," to help health organizations and governments with contact tracing. The technology and its interplay with security and privacy constraints require investigation. In this study, we examine and analyze the security, privacy, and reliability of the technology with actual and typical scenarios (and expected typical adversary in mind), and quite realistic use cases. We do it in the context of Virginia's COVIDWISE app. This experimental analysis validates the properties of the system under the above conditions, a result that seems crucial for the peace of mind of the exposure notification technology adopting authorities, and may also help with the system's transparency and overall user trust.
CRJul 11, 2018
Differentially-Private "Draw and Discard" Machine LearningVasyl Pihur, Aleksandra Korolova, Frederick Liu et al.
In this work, we propose a novel framework for privacy-preserving client-distributed machine learning. It is motivated by the desire to achieve differential privacy guarantees in the local model of privacy in a way that satisfies all systems constraints using asynchronous client-server communication and provides attractive model learning properties. We call it "Draw and Discard" because it relies on random sampling of models for load distribution (scalability), which also provides additional server-side privacy protections and improved model quality through averaging. We present the mechanics of client and server components of "Draw and Discard" and demonstrate how the framework can be applied to learning Generalized Linear models. We then analyze the privacy guarantees provided by our approach against several types of adversaries and showcase experimental results that provide evidence for the framework's viability in practical deployments.
CRFeb 24, 2015
Sequential Aggregate Signatures with Short Public Keys without Random OraclesKwangsu Lee, Dong Hoon Lee, Moti Yung
The notion of aggregate signature has been motivated by applications and it enables any user to compress different signatures signed by different signers on different messages into a short signature. Sequential aggregate signature, in turn, is a special kind of aggregate signature that only allows a signer to add his signature into an aggregate signature in sequential order. This latter scheme has applications in diversified settings such as in reducing bandwidth of certificate chains and in secure routing protocols. Lu, Ostrovsky, Sahai, Shacham, and Waters (EUROCRYPT 2006) presented the first sequential aggregate signature scheme in the standard model. The size of their public key, however, is quite large (i.e., the number of group elements is proportional to the security parameter), and therefore, they suggested as an open problem the construction of such a scheme with short keys. In this paper, we propose the first sequential aggregate signature schemes with short public keys (i.e., a constant number of group elements) in prime order (asymmetric) bilinear groups that are secure under static assumptions in the standard model. Furthermore, our schemes employ a constant number of pairing operations per message signing and message verification operation. Technically, we start with a public-key signature scheme based on the recent dual system encryption technique of Lewko and Waters (TCC 2010). This technique cannot directly provide an aggregate signature scheme since, as we observed, additional elements should be published in a public key to support aggregation. Thus, our constructions are careful augmentation techniques for the dual system technique to allow it to support sequential aggregate signature schemes. We also propose a multi-signature scheme with short public parameters in the standard model.