Swanand Kadhe

LG
h-index33
17papers
776citations
Novelty60%
AI Score49

17 Papers

LGJul 12, 2022
Federated Unlearning: How to Efficiently Erase a Client in FL?

Anisa Halimi, Swanand Kadhe, Ambrish Rawat et al.

With privacy legislation empowering the users with the right to be forgotten, it has become essential to make a model amenable for forgetting some of its training data. However, existing unlearning methods in the machine learning context can not be directly applied in the context of distributed settings like federated learning due to the differences in learning protocol and the presence of multiple actors. In this paper, we tackle the problem of federated unlearning for the case of erasing a client by removing the influence of their entire local data from the trained global model. To erase a client, we propose to first perform local unlearning at the client to be erased, and then use the locally unlearned model as the initialization to run very few rounds of federated learning between the server and the remaining clients to obtain the unlearned global model. We empirically evaluate our unlearning method by employing multiple performance measures on three datasets, and demonstrate that our unlearning method achieves comparable performance as the gold standard unlearning method of federated retraining from scratch, while being significantly efficient. Unlike prior works, our unlearning method neither requires global access to the data used for training nor the history of the parameter updates to be stored by the server or any of the clients.

LGDec 16, 2022
Robust Learning Protocol for Federated Tumor Segmentation Challenge

Ambrish Rawat, Giulio Zizzo, Swanand Kadhe et al.

In this work, we devise robust and efficient learning protocols for orchestrating a Federated Learning (FL) process for the Federated Tumor Segmentation Challenge (FeTS 2022). Enabling FL for FeTS setup is challenging mainly due to data heterogeneity among collaborators and communication cost of training. To tackle these challenges, we propose Robust Learning Protocol (RoLePRO) which is a combination of server-side adaptive optimisation (e.g., server-side Adam) and judicious parameter (weights) aggregation schemes (e.g., adaptive weighted aggregation). RoLePRO takes a two-phase approach, where the first phase consists of vanilla Federated Averaging, while the second phase consists of a judicious aggregation scheme that uses a sophisticated reweighting, all in the presence of an adaptive optimisation algorithm at the server. We draw insights from extensive experimentation to tune learning rates for the two phases.

CRFeb 22, 2025Code
Protecting Users From Themselves: Safeguarding Contextual Privacy in Interactions with Conversational Agents

Ivoline Ngong, Swanand Kadhe, Hao Wang et al.

Conversational agents are increasingly woven into individuals' personal lives, yet users often underestimate the privacy risks associated with them. The moment users share information with these agents-such as large language models (LLMs)-their private information becomes vulnerable to exposure. In this paper, we characterize the notion of contextual privacy for user interactions with LLM-based Conversational Agents (LCAs). It aims to minimize privacy risks by ensuring that users (sender) disclose only information that is both relevant and necessary for achieving their intended goals when interacting with LCAs (untrusted receivers). Through a formative design user study, we observe how even "privacy-conscious" users inadvertently reveal sensitive information through indirect disclosures. Based on insights from this study, we propose a locally deployable framework that operates between users and LCAs, identifying and reformulating out-of-context information in user prompts. Our evaluation using examples from ShareGPT shows that lightweight models can effectively implement this framework, achieving strong gains in contextual privacy while preserving the user's intended interaction goals. Notably, about 76% of participants in our human evaluation preferred the reformulated prompts over the original ones, validating the usability and effectiveness of contextual privacy in our proposed framework. We opensource the code at https://github.com/IBM/contextual-privacy-LLM.

CRMar 5
AgentSCOPE: Evaluating Contextual Privacy Across Agentic Workflows

Ivoline C. Ngong, Keerthiram Murugesan, Swanand Kadhe et al.

Agentic systems are increasingly acting on users' behalf, accessing calendars, email, and personal files to complete everyday tasks. Privacy evaluation for these systems has focused on the input and output boundaries, but each task involves several intermediate information flows, from agent queries to tool responses, that are not currently evaluated. We argue that every boundary in an agentic pipeline is a site of potential privacy violation and must be assessed independently. To support this, we introduce the Privacy Flow Graph, a Contextual Integrity-grounded framework that decomposes agentic execution into a sequence of information flows, each annotated with the five CI parameters, and traces violations to their point of origin. We present AgentSCOPE, a benchmark of 62 multi-tool scenarios across eight regulatory domains with ground truth at every pipeline stage. Our evaluation across seven state-of-the-art LLMs show that privacy violations in the pipeline occur in over 80% of scenarios, even when final outputs appear clean (24%), with most violations arising at the tool-response stage where APIs return sensitive data indiscriminately. These results indicate that output-level evaluation alone substantially underestimates the privacy risk of agentic systems.

LGJun 16, 2025
Membership Inference Attacks as Privacy Tools: Reliability, Disparity and Ensemble

Zhiqi Wang, Chengyu Zhang, Yuetian Chen et al.

Membership inference attacks (MIAs) pose a significant threat to the privacy of machine learning models and are widely used as tools for privacy assessment, auditing, and machine unlearning. While prior MIA research has primarily focused on performance metrics such as AUC, accuracy, and TPR@low FPR - either by developing new methods to enhance these metrics or using them to evaluate privacy solutions - we found that it overlooks the disparities among different attacks. These disparities, both between distinct attack methods and between multiple instantiations of the same method, have crucial implications for the reliability and completeness of MIAs as privacy evaluation tools. In this paper, we systematically investigate these disparities through a novel framework based on coverage and stability analysis. Extensive experiments reveal significant disparities in MIAs, their potential causes, and their broader implications for privacy evaluation. To address these challenges, we propose an ensemble framework with three distinct strategies to harness the strengths of state-of-the-art MIAs while accounting for their disparities. This framework not only enables the construction of more powerful attacks but also provides a more robust and comprehensive methodology for privacy evaluation.

LGMay 3, 2023
LESS-VFL: Communication-Efficient Feature Selection for Vertical Federated Learning

Timothy Castiglia, Yi Zhou, Shiqiang Wang et al.

We propose LESS-VFL, a communication-efficient feature selection method for distributed systems with vertically partitioned data. We consider a system of a server and several parties with local datasets that share a sample ID space but have different feature sets. The parties wish to collaboratively train a model for a prediction task. As part of the training, the parties wish to remove unimportant features in the system to improve generalization, efficiency, and explainability. In LESS-VFL, after a short pre-training period, the server optimizes its part of the global model to determine the relevant outputs from party models. This information is shared with the parties to then allow local feature selection without communication. We analytically prove that LESS-VFL removes spurious features from model training. We provide extensive empirical evidence that LESS-VFL can achieve high accuracy and remove spurious features at a fraction of the communication cost of other feature selection approaches.

LGOct 14, 2021
Leveraging Spatial and Temporal Correlations in Sparsified Mean Estimation

Divyansh Jhunjhunwala, Ankur Mallick, Advait Gadhikar et al.

We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node). When the vectors are high-dimensional, the communication cost of sending entire vectors may be prohibitive, and it may be imperative for them to use sparsification techniques. While most existing work on sparsified mean estimation is agnostic to the characteristics of the data vectors, in many practical applications such as federated learning, there may be spatial correlations (similarities in the vectors sent by different nodes) or temporal correlations (similarities in the data sent by a single node over different iterations of the algorithm) in the data vectors. We leverage these correlations by simply modifying the decoding method used by the server to estimate the mean. We provide an analysis of the resulting estimation error as well as experiments for PCA, K-Means and Logistic Regression, which show that our estimators consistently outperform more sophisticated and expensive sparsification methods.

CROct 1, 2020
CoVer: Collaborative Light-Node-Only Verification and Data Availability for Blockchains

Steven Cao, Swanand Kadhe, Kannan Ramchandran

Validating a blockchain incurs heavy computation, communication, and storage costs. As a result, clients with limited resources, called light nodes, cannot verify transactions independently and must trust full nodes, making them vulnerable to security attacks. Motivated by this problem, we ask a fundamental question: can light nodes securely validate without any full nodes? We answer affirmatively by proposing CoVer, a decentralized protocol that allows a group of light nodes to collaboratively verify blocks even under a dishonest majority, achieving the same level of security for block validation as full nodes while only requiring a fraction of the work. In particular, work per node scales down proportionally with the number of participants (up to a log factor), resulting in computation, communication, and storage requirements that are sublinear in block size. Our main contributions are light-node-only protocols for fraud proofs and data availability.

CRSep 23, 2020
FastSecAgg: Scalable Secure Aggregation for Privacy-Preserving Federated Learning

Swanand Kadhe, Nived Rajaraman, O. Ozan Koyluoglu et al.

Recent attacks on federated learning demonstrate that keeping the training data on clients' devices does not provide sufficient privacy, as the model parameters shared by clients can leak information about their training data. A 'secure aggregation' protocol enables the server to aggregate clients' models in a privacy-preserving manner. However, existing secure aggregation protocols incur high computation/communication costs, especially when the number of model parameters is larger than the number of clients participating in an iteration -- a typical scenario in federated learning. In this paper, we propose a secure aggregation protocol, FastSecAgg, that is efficient in terms of computation and communication, and robust to client dropouts. The main building block of FastSecAgg is a novel multi-secret sharing scheme, FastShare, based on the Fast Fourier Transform (FFT), which may be of independent interest. FastShare is information-theoretically secure, and achieves a trade-off between the number of secrets, privacy threshold, and dropout tolerance. Riding on the capabilities of FastShare, we prove that FastSecAgg is (i) secure against the server colluding with 'any' subset of some constant fraction (e.g. $\sim10\%$) of the clients in the honest-but-curious setting; and (ii) tolerates dropouts of a 'random' subset of some constant fraction (e.g. $\sim10\%$) of the clients. FastSecAgg achieves significantly smaller computation cost than existing schemes while achieving the same (orderwise) communication cost. In addition, it guarantees security against adaptive adversaries, which can perform client corruptions dynamically during the execution of the protocol.

ITMay 14, 2020
Communication-Efficient Gradient Coding for Straggler Mitigation in Distributed Learning

Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, need to overcome two limitations: delays caused by slow running machines called 'stragglers', and communication overheads. Recently, Ye and Abbe [ICML 2018] proposed a coding-theoretic paradigm to characterize a fundamental trade-off between computation load per worker, communication overhead per worker, and straggler tolerance. However, their proposed coding schemes suffer from heavy decoding complexity and poor numerical stability. In this paper, we develop a communication-efficient gradient coding framework to overcome these drawbacks. Our proposed framework enables using any linear code to design the encoding and decoding functions. When a particular code is used in this framework, its block-length determines the computation load, dimension determines the communication overhead, and minimum distance determines the straggler tolerance. The flexibility of choosing a code allows us to gracefully trade-off the straggler threshold and communication overhead for smaller decoding complexity and higher numerical stability. Further, we show that using a maximum distance separable (MDS) code generated by a random Gaussian matrix in our framework yields a gradient code that is optimal with respect to the trade-off and, in addition, satisfies stronger guarantees on numerical stability as compared to the previously proposed schemes. Finally, we evaluate our proposed framework on Amazon EC2 and demonstrate that it reduces the average iteration time by 16% as compared to prior gradient coding schemes.

LGNov 21, 2019
Communication-Efficient and Byzantine-Robust Distributed Learning with Error Feedback

Avishek Ghosh, Raj Kumar Maity, Swanand Kadhe et al.

We develop a communication-efficient distributed learning algorithm that is robust against Byzantine worker machines. We propose and analyze a distributed gradient-descent algorithm that performs a simple thresholding based on gradient norms to mitigate Byzantine failures. We show the (statistical) error-rate of our algorithm matches that of Yin et al.~\cite{dong}, which uses more complicated schemes (coordinate-wise median, trimmed mean). Furthermore, for communication efficiency, we consider a generic class of $δ$-approximate compressors from Karimireddi et al.~\cite{errorfeed} that encompasses sign-based compressors and top-$k$ sparsification. Our algorithm uses compressed gradients and gradient norms for aggregation and Byzantine removal respectively. We establish the statistical error rate for non-convex smooth loss functions. We show that, in certain range of the compression factor $δ$, the (order-wise) rate of convergence is not affected by the compression operation. Moreover, we analyze the compressed gradient descent algorithm with error feedback (proposed in \cite{errorfeed}) in a distributed setting and in the presence of Byzantine worker machines. We show that exploiting error feedback improves the statistical error rate. Finally, we experimentally validate our results and show good performance in convergence for convex (least-square regression) and non-convex (neural network training) problems.

ITJul 1, 2019
On an Equivalence Between Single-Server PIR with Side Information and Locally Recoverable Codes

Swanand Kadhe, Anoosheh Heidarzadeh, Alex Sprintson et al.

Private Information Retrieval (PIR) problem has recently attracted a significant interest in the information-theory community. In this problem, a user wants to privately download one or more messages belonging to a database with copies stored on a single or multiple remote servers. In the single server scenario, the user must have prior side information, i.e., a subset of messages unknown to the server, to be able to privately retrieve the required messages in an efficient way. In the last decade, there has also been a significant interest in Locally Recoverable Codes (LRC), a class of storage codes in which each symbol can be recovered from a limited number of other symbols. More recently, there is an interest in 'cooperative' locally recoverable codes, i.e., codes in which multiple symbols can be recovered from a small set of other code symbols. In this paper, we establish a relationship between coding schemes for the single-server PIR problem and LRCs. In particular, we show the following results: (i) PIR schemes designed for retrieving a single message are equivalent to classical LRCs; and (ii) PIR schemes for retrieving multiple messages are equivalent to cooperative LRCs. These equivalence results allow us to recover upper bounds on the download rate for PIR-SI schemes, and to obtain a novel rate upper bound on cooperative LRCs. We show results for both linear and non-linear codes.

CRJun 28, 2019
SeF: A Secure Fountain Architecture for Slashing Storage Costs in Blockchains

Swanand Kadhe, Jichan Chung, Kannan Ramchandran

Full nodes, which synchronize the entire blockchain history and independently validate all the blocks, form the backbone of any blockchain network by playing a vital role in ensuring security properties. On the other hand, a user running a full node needs to pay a heavy price in terms of storage costs. E.g., the Bitcoin blockchain size has grown over 215GB, in spite of its low throughput. The ledger size for a high throughput blockchain Ripple has already reached 9TB, and it is growing at an astonishing rate of 12GB per day! In this paper, we propose an architecture based on 'fountain codes', a class of erasure codes, that enables any full node to 'encode' validated blocks into a small number of 'coded blocks', thereby reducing its storage costs by orders of magnitude. In particular, our proposed "Secure Fountain (SeF)" architecture can achieve a near-optimal trade-off between the storage savings per node and the 'bootstrap cost' in terms of the number of (honest) storage-constrained nodes a new node needs to contact to recover the blockchain. A key technical innovation in SeF codes is to make fountain codes secure against adversarial nodes that can provide maliciously formed coded blocks. Our idea is to use the header-chain as a 'side-information' to check whether a coded block is maliciously formed while it is getting decoded. Further, the 'rateless property' of fountain codes helps in achieving high decentralization and scalability. Our experiments demonstrate that SeF codes tuned to achieve 1000x storage savings enable full nodes to encode the 191GB Bitcoin blockchain into 195MB on average. A new node can recover the blockchain from an arbitrary set of storage-constrained nodes as long as the set contains ~1100 honest nodes on average. Note that for a 1000x storage savings, the fundamental bound on the number of honest nodes to contact is 1000: we need about 10% more in practice.

ITApr 30, 2019
Gradient Coding Based on Block Designs for Mitigating Adversarial Stragglers

Swanand Kadhe, O. Ozan Koyluoglu, Kannan Ramchandran

Distributed implementations of gradient-based methods, wherein a server distributes gradient computations across worker machines, suffer from slow running machines, called 'stragglers'. Gradient coding is a coding-theoretic framework to mitigate stragglers by enabling the server to recover the gradient sum in the presence of stragglers. 'Approximate gradient codes' are variants of gradient codes that reduce computation and storage overhead per worker by allowing the server to approximately reconstruct the gradient sum. In this work, our goal is to construct approximate gradient codes that are resilient to stragglers selected by a computationally unbounded adversary. Our motivation for constructing codes to mitigate adversarial stragglers stems from the challenge of tackling stragglers in massive-scale elastic and serverless systems, wherein it is difficult to statistically model stragglers. Towards this end, we propose a class of approximate gradient codes based on balanced incomplete block designs (BIBDs). We show that the approximation error for these codes depends only on the number of stragglers, and thus, adversarial straggler selection has no advantage over random selection. In addition, the proposed codes admit computationally efficient decoding at the server. Next, to characterize fundamental limits of adversarial straggling, we consider the notion of 'adversarial threshold' -- the smallest number of workers that an adversary must straggle to inflict certain approximation error. We compute a lower bound on the adversarial threshold, and show that codes based on symmetric BIBDs maximize this lower bound among a wide class of codes, making them excellent candidates for mitigating adversarial stragglers.

DCMar 21, 2019
OverSketched Newton: Fast Convex Optimization for Serverless Systems

Vipul Gupta, Swanand Kadhe, Thomas Courtade et al.

Motivated by recent developments in serverless systems for large-scale computation as well as improvements in scalable randomized matrix algorithms, we develop OverSketched Newton, a randomized Hessian-based optimization algorithm to solve large-scale convex optimization problems in serverless systems. OverSketched Newton leverages matrix sketching ideas from Randomized Numerical Linear Algebra to compute the Hessian approximately. These sketching methods lead to inbuilt resiliency against stragglers that are a characteristic of serverless architectures. Depending on whether the problem is strongly convex or not, we propose different iteration updates using the approximate Hessian. For both cases, we establish convergence guarantees for OverSketched Newton and empirically validate our results by solving large-scale supervised learning problems on real-world datasets. Experiments demonstrate a reduction of ~50% in total running time on AWS Lambda, compared to state-of-the-art distributed optimization schemes.

ITOct 18, 2017
Universally Weakly Secure Coset Coding Schemes for Minimum Storage Regenerating (MSR) Codes

Swanand Kadhe, Alex Sprintson

We consider the problem of designing codes for distributed storage that protect user data against eavesdroppers that can gain access to network links as well as individual nodes. Our goal is to achieve weak security (also known as block security) that requires that the eavesdroppers would not be able to decode individual files or combinations of a small number of files. The standard approach for achieving block security is to use a joint design scheme that consists of (inner) storage code and the (outer) coset code. However, jointly designing the codes requires that the user, who pre-processes and stores the files, should know the underlying storage code in order to design the (outer) linear transformation for achieving weak security. In many practical scenarios, such as storing the files on the third party cloud storage system, it may not be possible for the user to know the underlying storage code. In this work, we present universal schemes that separate the outer code design from the storage code design for minimum storage regenerating codes (MSR). Our schemes allow the independent design of the storage code and the outer code. Our schemes use small field size and can be used in a broad range of practical settings.

ITSep 1, 2017
Private Information Retrieval with Side Information

Swanand Kadhe, Brenden Garcia, Anoosheh Heidarzadeh et al.

We study the problem of Private Information Retrieval (PIR) in the presence of prior side information. The problem setup includes a database of $K$ independent messages possibly replicated on several servers, and a user that needs to retrieve one of these messages. In addition, the user has some prior side information in the form of a subset of $M$ messages, not containing the desired message and unknown to the servers. This problem is motivated by practical settings in which the user can obtain side information opportunistically from other users or has previously downloaded some messages using classical PIR schemes. The objective of the user is to retrieve the required message without revealing its identity while minimizing the amount of data downloaded from the servers. We focus on achieving information-theoretic privacy in two scenarios: (i) the user wants to protect jointly its demand and side information; (ii) the user wants to protect only the information about its demand, but not the side information. To highlight the role of side information, we focus first on the case of a single server (single database). In the first scenario, we prove that the minimum download cost is $K-M$ messages, and in the second scenario it is $\lceil \frac{K}{M+1}\rceil$ messages, which should be compared to $K$ messages, the minimum download cost in the case of no side information. Then, we extend some of our results to the case of the database replicated on multiple servers. Our proof techniques relate PIR with side information to the index coding problem. We leverage this connection to prove converse results, as well as to design achievability schemes.