Bargav Jayaraman

h-index10

9papers

533citations

Novelty53%

AI Score40

Ranked #74,297 of 194,257 authors (top 38%)#1,774 in CR (top 26%)

9 Papers

30.9CRSep 2, 2022Code

Are Attribute Inference Attacks Just Imputation?

Bargav Jayaraman, David Evans

Models can expose sensitive information about their training data. In an attribute inference attack, an adversary has partial knowledge of some training records and access to a model trained on those records, and infers the unknown values of a sensitive feature of those records. We study a fine-grained variant of attribute inference we call \emph{sensitive value inference}, where the adversary's goal is to identify with high confidence some records from a candidate set where the unknown attribute has a particular sensitive value. We explicitly compare attribute inference with data imputation that captures the training distribution statistics, under various assumptions about the training data available to the adversary. Our main conclusions are: (1) previous attribute inference methods do not reveal more about the training data from the model than can be inferred by an adversary without access to the trained model, but with the same knowledge of the underlying distribution as needed to train the attribute inference attack; (2) black-box attribute inference attacks rarely learn anything that cannot be learned without the model; but (3) white-box attacks, which we introduce and evaluate in the paper, can reliably identify some records with the sensitive value attribute that would not be predicted without having access to the model. Furthermore, we show that proposed defenses such as differentially private training and removing vulnerable records from training do not mitigate this privacy risk. The code for our experiments is available at \url{https://github.com/bargavj/EvaluatingDPML}.

15.3CRJul 14, 2022

Combing for Credentials: Active Pattern Extraction from Smart Reply

Bargav Jayaraman, Esha Ghosh, Melissa Chase et al.

Pre-trained large language models, such as GPT\nobreakdash-2 and BERT, are often fine-tuned to achieve state-of-the-art performance on a downstream task. One natural example is the ``Smart Reply'' application where a pre-trained model is tuned to provide suggested responses for a given query message. Since the tuning data is often sensitive data such as emails or chat transcripts, it is important to understand and mitigate the risk that the model leaks its tuning data. We investigate potential information leakage vulnerabilities in a typical Smart Reply pipeline. We consider a realistic setting where the adversary can only interact with the underlying model through a front-end interface that constrains what types of queries can be sent to the model. Previous attacks do not work in these settings, but require the ability to send unconstrained queries directly to the model. Even when there are no constraints on the queries, previous attacks typically require thousands, or even millions, of queries to extract useful information, while our attacks can extract sensitive data in just a handful of queries. We introduce a new type of active extraction attack that exploits canonical patterns in text containing sensitive data. We show experimentally that it is possible for an adversary to extract sensitive user information present in the training data, even in realistic settings where all interactions with the model must go through a front-end that limits the types of queries. We explore potential mitigation strategies and demonstrate empirically how differential privacy appears to be a reasonably effective defense mechanism to such pattern extraction attacks.

11.4LGApr 8, 2025Code

Measuring Déjà vu Memorization Efficiently

Narine Kokhlikyan, Bargav Jayaraman, Florian Bordes et al.

Recent research has shown that representation learning models may accidentally memorize their training data. For example, the déjà vu method shows that for certain representation learning models and training images, it is sometimes possible to correctly predict the foreground label given only the representation of the background - better than through dataset-level correlations. However, their measurement method requires training two models - one to estimate dataset-level correlations and the other to estimate memorization. This multiple model setup becomes infeasible for large open-source models. In this work, we propose alternative simple methods to estimate dataset-level correlations, and show that these can be used to approximate an off-the-shelf model's memorization ability without any retraining. This enables, for the first time, the measurement of memorization in pre-trained open-source image representation and vision-language representation models. Our results show that different ways of measuring memorization yield very similar aggregate results. We also find that open-source models typically have lower aggregate memorization than similar models trained on a subset of the data. The code is available both for vision and vision language models.

38.2CRMay 21, 2020Code

Revisiting Membership Inference Under Realistic Assumptions

Bargav Jayaraman, Lingxiao Wang, Katherine Knipmeyer et al.

We study membership inference in settings where some of the assumptions typically used in previous research are relaxed. First, we consider skewed priors, to cover cases such as when only a small fraction of the candidate pool targeted by the adversary are actually members and develop a PPV-based metric suitable for this setting. This setting is more realistic than the balanced prior setting typically considered by researchers. Second, we consider adversaries that select inference thresholds according to their attack goals and develop a threshold selection procedure that improves inference attacks. Since previous inference attacks fail in imbalanced prior setting, we develop a new inference attack based on the intuition that inputs corresponding to training set members will be near a local minimum in the loss function, and show that an attack that combines this with thresholds on the per-instance loss can achieve high PPV even in settings where other attacks appear to be ineffective. Code for our experiments can be found here: https://github.com/bargavj/EvaluatingDPML.

37.5LGFeb 24, 2019Code

Evaluating Differentially Private Machine Learning in Practice

Bargav Jayaraman, David Evans

Differential privacy is a strong notion for privacy that can be used to prove formal guarantees, in terms of a privacy budget, $ε$, about how much information is leaked by a mechanism. However, implementations of privacy-preserving machine learning often select large values of $ε$ in order to get acceptable utility of the model, with little understanding of the impact of such choices on meaningful privacy. Moreover, in scenarios where iterative learning procedures are used, differential privacy variants that offer tighter analyses are used which appear to reduce the needed privacy budget but present poorly understood trade-offs between privacy and utility. In this paper, we quantify the impact of these choices on privacy in experiments with logistic regression and neural network models. Our main finding is that there is a huge gap between the upper bounds on privacy loss that can be guaranteed, even with advanced mechanisms, and the effective privacy loss that can be measured using current inference attacks. Current mechanisms for differentially private machine learning rarely offer acceptable utility-privacy trade-offs with guarantees for complex learning tasks: settings that provide limited accuracy loss provide meaningless privacy guarantees, and settings that provide strong privacy guarantees result in useless models. Code for the experiments can be found here: https://github.com/bargavj/EvaluatingDPML

10.5CVFeb 3, 2024

Déjà Vu Memorization in Vision-Language Models

Bargav Jayaraman, Chuan Guo, Kamalika Chaudhuri

Vision-Language Models (VLMs) have emerged as the state-of-the-art representation learning solution, with myriads of downstream applications such as image classification, retrieval and generation. A natural question is whether these models memorize their training data, which also has implications for generalization. We propose a new method for measuring memorization in VLMs, which we call déjà vu memorization. For VLMs trained on image-caption pairs, we show that the model indeed retains information about individual objects in the training images beyond what can be inferred from correlations or the image caption. We evaluate déjà vu memorization at both sample and population level, and show that it is significant for OpenCLIP trained on as many as 50M image-caption pairs. Finally, we show that text randomization considerably mitigates memorization while only moderately impacting the model's downstream task performance.

12.0CRMay 28, 2025

Permissioned LLMs: Enforcing Access Control in Large Language Models

Bargav Jayaraman, Virendra J. Marathe, Hamid Mozaffari et al.

In enterprise settings, organizational data is segregated, siloed and carefully protected by elaborate access control frameworks. These access control structures can completely break down if an LLM fine-tuned on the siloed data serves requests, for downstream tasks, from individuals with disparate access privileges. We propose Permissioned LLMs (PermLLM), a new class of LLMs that superimpose the organizational data access control structures on query responses they generate. We formalize abstractions underpinning the means to determine whether access control enforcement happens correctly over LLM query responses. Our formalism introduces the notion of a relevant response that can be used to prove whether a PermLLM mechanism has been implemented correctly. We also introduce a novel metric, called access advantage, to empirically evaluate the efficacy of a PermLLM mechanism. We introduce three novel PermLLM mechanisms that build on Parameter Efficient Fine-Tuning to achieve the desired access control. We furthermore present two instantiations of access advantage--(i) Domain Distinguishability Index (DDI) based on Membership Inference Attacks, and (ii) Utility Gap Index (UGI) based on LLM utility evaluation. We demonstrate the efficacy of our PermLLM mechanisms through extensive experiments on five public datasets (GPQA, RCV1, SimpleQA, WMDP, and PubMedQA), in addition to evaluating the validity of DDI and UGI metrics themselves for quantifying access control in LLMs.

19.5LGOct 30, 2019

Efficient Privacy-Preserving Stochastic Nonconvex Optimization

Lingxiao Wang, Bargav Jayaraman, David Evans et al.

While many solutions for privacy-preserving convex empirical risk minimization (ERM) have been developed, privacy-preserving nonconvex ERM remains a challenge. We study nonconvex ERM, which takes the form of minimizing a finite-sum of nonconvex loss functions over a training set. We propose a new differentially private stochastic gradient descent algorithm for nonconvex ERM that achieves strong privacy guarantees efficiently, and provide a tight analysis of its privacy and utility guarantees, as well as its gradient complexity. Our algorithm reduces gradient complexity while improves the best previous utility guarantee given by Wang et al. (NeurIPS 2017). Our experiments on benchmark nonconvex ERM problems demonstrate superior performance in terms of both training cost and utility gains compared with previous differentially private methods using the same privacy budgets.

2.5CRJun 11, 2017

Decentralized Certificate Authorities

Bargav Jayaraman, Hannah Li, David Evans

The security of TLS depends on trust in certificate authorities, and that trust stems from their ability to protect and control the use of a private signing key. The signing key is the key asset of a certificate authority (CA), and its value is based on trust in the corresponding public key which is primarily distributed by browser vendors. Compromise of a CA private key represents a single point-of-failure that could have disastrous consequences, so CAs go to great lengths to attempt to protect and control the use of their private keys. Nevertheless, keys are sometimes compromised and may be misused accidentally or intentionally by insiders. We propose splitting a CA's private key among multiple parties, and producing signatures using a generic secure multi-party computation protocol that never exposes the actual signing key. This could be used by a single CA to reduce the risk that its signing key would be compromised or misused. It could also enable new models for certificate generation, where multiple CAs would need to agree and cooperate before a new certificate can be generated, or even where certificate generation would require cooperation between a CA and the certificate recipient (subject). Although more efficient solutions are possible with custom protocols, we demonstrate the feasibility of implementing a decentralized CA using a generic two-party secure computation protocol with an evaluation of a prototype implementation that uses secure two-party computation to generate certificates signed using ECDSA on curve secp192k1.