ITAug 8, 2023
Iterative Sketching for Secure Coded RegressionNeophytos Charalambides, Hessam Mahdavifar, Mert Pilanci et al.
Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for speeding up distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resiliency in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the basis rotation corresponds to an encoded encryption in an approximate gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling servers in the centralized coded computing framework. This results in a distributive iterative stochastic approach for matrix compression and steepest descent.
ITJan 16, 2023
Machine Learning-Aided Efficient Decoding of Reed-Muller SubcodesMohammad Vahid Jamali, Xiyang Liu, Ashok Vardhan Makkuva et al.
Reed-Muller (RM) codes achieve the capacity of general binary-input memoryless symmetric channels and are conjectured to have a comparable performance to that of random codes in terms of scaling laws. However, such results are established assuming maximum-likelihood decoders for general code parameters. Also, RM codes only admit limited sets of rates. Efficient decoders such as successive cancellation list (SCL) decoder and recently-introduced recursive projection-aggregation (RPA) decoders are available for RM codes at finite lengths. In this paper, we focus on subcodes of RM codes with flexible rates. We first extend the RPA decoding algorithm to RM subcodes. To lower the complexity of our decoding algorithm, referred to as subRPA, we investigate different approaches to prune the projections. Next, we derive the soft-decision based version of our algorithm, called soft-subRPA, that not only improves upon the performance of subRPA but also enables a differentiable decoding algorithm. Building upon the soft-subRPA algorithm, we then provide a framework for training a machine learning (ML) model to search for \textit{good} sets of projections that minimize the decoding error rate. Training our ML model enables achieving very close to the performance of full-projection decoding with a significantly smaller number of projections. We also show that the choice of the projections in decoding RM subcodes matters significantly, and our ML-aided projection pruning scheme is able to find a \textit{good} selection, i.e., with negligible performance degradation compared to the full-projection case, given a reasonable number of projections.
ITMar 31
Efficient Low-Memory Fast Stack Decoding with Variance Polarization for PAC CodesMohsen Moradi, Hessam Mahdavifar
Polarization-adjusted convolutional (PAC) codes have recently emerged as a promising class of error-correcting codes, achieving near-capacity performance particularly in the short block-length regime. In this paper, we propose an enhanced stack decoding algorithm for PAC codes that significantly improves parallelization by exploiting specialized bit nodes, such as rate-0 and rate-1 nodes. For a rate-1 node with $N_0$ leaf nodes in its corresponding subtree, conventional stack decoding must either explore all $2^{N_0}$ paths, or, same as in fast list decoding, restrict attention to a constant number of candidate paths. In contrast, our approach introduces a pruning technique that removes candidate paths with small path metrics while ensuring that the probability of pruning the correct path decays exponentially with the threshold. Furthermore, we propose a novel approximation method for estimating variance polarization under the binary-input additive white Gaussian noise (BI-AWGN) channel. Leveraging these approximations, we develop an efficient stack-pruning strategy that selectively preserves decoding paths whose bit-metric values align with their expected means. This targeted pruning substantially reduces the number of active paths in the stack, thereby decreasing both decoding latency and computational complexity. Numerical results demonstrate that for a PAC$(128,64)$ code, our method achieves up to a 70\% reduction in the average number of paths without degrading error-correction performance.
ITMay 13
Univariate Bicycle Quantum LDPC Codes: Explicit Logical Structure and Distance BoundsSheida Rabeti, Hessam Mahdavifar
We introduce univariate bicycle (UB) codes, a structured subclass of generalized bicycle (GB) quantum low-density parity-check (LDPC) codes obtained via a Frobenius relation. This construction reduces the code design space from a two-polynomial search in GB codes to a single-polynomial search, while preserving sparsity. We provide an explicit algebraic characterization of the logical coset spaces by constructing a basis for the logical quotient space, yielding a complete parametrization of logical operators. Leveraging this structure, we derive upper bounds on the minimum distance by relating structured logical representatives to cycle-density properties of associated circulant matrices. Finally, simulation results for short- to medium-length UB codes (block lengths ranging from a few hundred to approximately $10^3$) demonstrate competitive performance relative to existing GB and bivariate bicycle (BB) codes despite the additional algebraic restriction.
ITMay 13
Multiple-Bases Belief Propagation List Decoding for Quantum LDPC CodesSheida Rabeti, Hessam Mahdavifar
In this paper, we propose a belief-propagation (BP)-based decoder, termed the Multiple-Bases Belief-Propagation List Decoder (MBBP-LD), for quantum low-density parity-check (QLDPC) codes. The key idea is to generate \emph{structured decoding diversity} by constructing multiple redundant parity-check representations via cycle-free subtree decompositions of the Tanner graph, and running BP decoding in parallel across these representations. This extends the classical Multiple-Bases Belief-Propagation (MBBP) framework to the quantum setting while preserving the linear-time complexity and efficiency of standard BP decoding, and avoids the need for super-linear post-processing. Simulation results demonstrate that MBBP-LD improves upon existing BP-based decoders, including BP with ordered statistics decoding (BP-OSD) and belief propagation with guided decimation (BPGD) across several QLDPC codes, while requiring substantially fewer total BP iterations. For bivariate bicycle codes $[[144,12,12]]$ and $[[288,12,18]]$, MBBP-LD achieves up to $20\%$ reduction in error rate compared to BPGD and up to $30\%$ compared to BP-OSD in the low- and moderate-error regimes. For the larger B1 code $[[882, 24, 18 \leq d \leq 24]]$, MBBP-LD attains comparable or improved performance relative to BPGD while maintaining BP-like decoding latency under parallel implementation.
CRJun 5, 2024Code
DREW : Towards Robust Data Provenance by Leveraging Error-Controlled WatermarkingMehrdad Saberi, Vinu Sankar Sadasivan, Arman Zarei et al.
Identifying the origin of data is crucial for data provenance, with applications including data ownership protection, media forensics, and detecting AI-generated content. A standard approach involves embedding-based retrieval techniques that match query data with entries in a reference dataset. However, this method is not robust against benign and malicious edits. To address this, we propose Data Retrieval with Error-corrected codes and Watermarking (DREW). DREW randomly clusters the reference dataset, injects unique error-controlled watermark keys into each cluster, and uses these keys at query time to identify the appropriate cluster for a given sample. After locating the relevant cluster, embedding vector similarity retrieval is performed within the cluster to find the most accurate matches. The integration of error control codes (ECC) ensures reliable cluster assignments, enabling the method to perform retrieval on the entire dataset in case the ECC algorithm cannot detect the correct cluster with high confidence. This makes DREW maintain baseline performance, while also providing opportunities for performance improvements due to the increased likelihood of correctly matching queries to their origin when performing retrieval on a smaller subset of the dataset. Depending on the watermark technique used, DREW can provide substantial improvements in retrieval accuracy (up to 40\% for some datasets and modification types) across multiple datasets and state-of-the-art embedding models (e.g., DinoV2, CLIP), making our method a promising solution for secure and reliable source identification. The code is available at https://github.com/mehrdadsaberi/DREW
ITAug 29, 2021Code
KO codes: Inventing Nonlinear Encoding and Decoding for Reliable Wireless Communication via Deep-learningAshok Vardhan Makkuva, Xiyang Liu, Mohammad Vahid Jamali et al.
Landmark codes underpin reliable physical layer communication, e.g., Reed-Muller, BCH, Convolution, Turbo, LDPC and Polar codes: each is a linear code and represents a mathematical breakthrough. The impact on humanity is huge: each of these codes has been used in global wireless communication standards (satellite, WiFi, cellular). Reliability of communication over the classical additive white Gaussian noise (AWGN) channel enables benchmarking and ranking of the different codes. In this paper, we construct KO codes, a computationaly efficient family of deep-learning driven (encoder, decoder) pairs that outperform the state-of-the-art reliability performance on the standardized AWGN channel. KO codes beat state-of-the-art Reed-Muller and Polar codes, under the low-complexity successive cancellation decoding, in the challenging short-to-medium block length regime on the AWGN channel. We show that the gains of KO codes are primarily due to the nonlinear mapping of information bits directly to transmit real symbols (bypassing modulation) and yet possess an efficient, high performance decoder. The key technical innovation that renders this possible is design of a novel family of neural architectures inspired by the computation tree of the {\bf K}ronecker {\bf O}peration (KO) central to Reed-Muller and Polar codes. These architectures pave way for the discovery of a much richer class of hitherto unexplored nonlinear algebraic structures. The code is available at \href{https://github.com/deepcomm/KOcodes}{https://github.com/deepcomm/KOcodes}
CRJun 21, 2025
A Locally Differential Private Coding-Assisted Succinct Histogram ProtocolHsuan-Po Liu, Hessam Mahdavifar
A succinct histogram captures frequent items and their frequencies across clients and has become increasingly important for large-scale, privacy-sensitive machine learning applications. To develop a rigorous framework to guarantee privacy for the succinct histogram problem, local differential privacy (LDP) has been utilized and shown promising results. To preserve data utility under LDP, which essentially works by intentionally adding noise to data, error-correcting codes naturally emerge as a promising tool for reliable information collection. This work presents the first practical $(ε,δ)$-LDP protocol for constructing succinct histograms using error-correcting codes. To this end, polar codes and their successive-cancellation list (SCL) decoding algorithms are leveraged as the underlying coding scheme. More specifically, our protocol introduces Gaussian-based perturbations to enable efficient soft decoding. Experiments demonstrate that our approach outperforms prior methods, particularly for items with low true frequencies, while maintaining similar frequency estimation accuracy.
ITJan 21, 2022
Orthonormal Sketches for Secure Coded RegressionNeophytos Charalambides, Hessam Mahdavifar, Mert Pilanci et al.
In this work, we propose a method for speeding up linear regression distributively, while ensuring security. We leverage randomized sketching techniques, and improve straggler resilience in asynchronous systems. Specifically, we apply a random orthonormal matrix and then subsample in \textit{blocks}, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the transformation corresponds to an encoded encryption in an \textit{approximate} gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling workers; in a centralized coded computing network. We focus on the special case of the \textit{Subsampled Randomized Hadamard Transform}, which we generalize to block sampling; and discuss how it can be used to secure the data. We illustrate the performance through numerical experiments.
LGOct 28, 2021
Federated Learning with Heterogeneous Differential PrivacyNasser Aldaghri, Hessam Mahdavifar, Ahmad Beirami
Federated learning (FL) takes a first step towards privacy-preserving machine learning by training models while keeping client data local. Models trained using FL may still leak private client information through model updates during training. Differential privacy (DP) may be employed on model updates to provide privacy guarantees within FL, typically at the cost of degraded performance of the final trained model. Both non-private FL and DP-FL can be solved using variants of the federated averaging (FedAvg) algorithm. In this work, we consider a heterogeneous DP setup where clients require varying degrees of privacy guarantees. First, we analyze the optimal solution to the federated linear regression problem with heterogeneous DP in a Bayesian setup. We find that unlike the non-private setup, where the optimal solution for homogeneous data amounts to a single global solution for all clients learned through FedAvg, the optimal solution for each client in this setup would be a personalized one even for homogeneous data. We also analyze the privacy-utility trade-off for this setup, where we characterize the gain obtained from heterogeneous privacy where some clients opt for less strict privacy guarantees. We propose a new algorithm for FL with heterogeneous DP, named FedHDP, which employs personalization and weighted averaging at the server using the privacy choices of clients, to achieve better performance on clients' local models. Through numerical experiments, we show that FedHDP provides up to $9.27\%$ performance gain compared to the baseline DP-FL for the considered datasets where $5\%$ of clients opt out of DP. Additionally, we show a gap in the average performance of local models between non-private and private clients of up to $3.49\%$, empirically illustrating that the baseline DP-FL might incur a large utility cost when not all clients require the stricter privacy guarantees.
LGSep 20, 2021
ApproxIFER: A Model-Agnostic Approach to Resilient and Robust Prediction Serving SystemsMahdi Soleymani, Ramy E. Ali, Hessam Mahdavifar et al.
Due to the surge of cloud-assisted AI services, the problem of designing resilient prediction serving systems that can effectively cope with stragglers/failures and minimize response delays has attracted much interest. The common approach for tackling this problem is replication which assigns the same prediction task to multiple workers. This approach, however, is very inefficient and incurs significant resource overheads. Hence, a learning-based approach known as parity model (ParM) has been recently proposed which learns models that can generate parities for a group of predictions in order to reconstruct the predictions of the slow/failed workers. While this learning-based approach is more resource-efficient than replication, it is tailored to the specific model hosted by the cloud and is particularly suitable for a small number of queries (typically less than four) and tolerating very few (mostly one) number of stragglers. Moreover, ParM does not handle Byzantine adversarial workers. We propose a different approach, named Approximate Coded Inference (ApproxIFER), that does not require training of any parity models, hence it is agnostic to the model hosted by the cloud and can be readily applied to different data domains and model architectures. Compared with earlier works, ApproxIFER can handle a general number of stragglers and scales significantly better with the number of queries. Furthermore, ApproxIFER is robust against Byzantine workers. Our extensive experiments on a large number of datasets and model architectures also show significant accuracy improvement by up to 58% over the parity model approaches.
ITJan 27, 2021
List-Decodable Coded Computing: Breaking the Adversarial Toleration BarrierMahdi Soleymani, Ramy E. Ali, Hessam Mahdavifar et al.
We consider the problem of coded computing, where a computational task is performed in a distributed fashion in the presence of adversarial workers. We propose techniques to break the adversarial toleration threshold barrier previously known in coded computing. More specifically, we leverage list-decoding techniques for folded Reed-Solomon codes and propose novel algorithms to recover the correct codeword using side information. In the coded computing setting, we show how the master node can perform certain carefully designed extra computations to obtain the side information. The workload of computing this side information is negligible compared to the computations done by each worker. This side information is then utilized to prune the output of the list decoder and uniquely recover the true outcome. We further propose folded Lagrange coded computing (FLCC) to incorporate the developed techniques into a specific coded computing setting. Our results show that FLCC outperforms LCC by breaking the barrier on the number of adversaries that can be tolerated. In particular, the corresponding threshold in FLCC is improved by a factor of two asymptotically compared to that of LCC.
LGDec 31, 2020
Coded Machine UnlearningNasser Aldaghri, Hessam Mahdavifar, Ahmad Beirami
There are applications that may require removing the trace of a sample from the system, e.g., a user requests their data to be deleted, or corrupted data is discovered. Simply removing a sample from storage units does not necessarily remove its entire trace since downstream machine learning models may store some information about the samples used to train them. A sample can be perfectly unlearned if we retrain all models that used it from scratch with that sample removed from their training dataset. When multiple such unlearning requests are expected to be served, unlearning by retraining becomes prohibitively expensive. Ensemble learning enables the training data to be split into smaller disjoint shards that are assigned to non-communicating weak learners. Each shard is used to produce a weak model. These models are then aggregated to produce the final central model. This setup introduces an inherent trade-off between performance and unlearning cost, as reducing the shard size reduces the unlearning cost but may cause degradation in performance. In this paper, we propose a coded learning protocol where we utilize linear encoders to encode the training data into shards prior to the learning phase. We also present the corresponding unlearning protocol and show that it satisfies the perfect unlearning criterion. Our experimental results show that the proposed coded machine unlearning provides a better performance versus unlearning cost trade-off compared to the uncoded baseline.
ITAug 19, 2020
Analog Lagrange Coded ComputingMahdi Soleymani, Hessam Mahdavifar, A. Salman Avestimehr
A distributed computing scenario is considered, where the computational power of a set of worker nodes is used to perform a certain computation task over a dataset that is dispersed among the workers. Lagrange coded computing (LCC), proposed by Yu et al., leverages the well-known Lagrange polynomial to perform polynomial evaluation of the dataset in such a scenario in an efficient parallel fashion while keeping the privacy of data amidst possible collusion of workers. This solution relies on quantizing the data into a finite field, so that Shamir's secret sharing, as one of its main building blocks, can be employed. Such a solution, however, is not properly scalable with the size of dataset, mainly due to computation overflows. To address such a critical issue, we propose a novel extension of LCC to the analog domain, referred to as analog LCC (ALCC). All the operations in the proposed ALCC protocol are done over the infinite fields of R/C but for practical implementations floating-point numbers are used. We characterize the privacy of data in ALCC, against any subset of colluding workers up to a certain size, in terms of the distinguishing security (DS) and the mutual information security (MIS) metrics. Also, the accuracy of outcome is characterized in a practical setting assuming operations are performed using floating-point numbers. Consequently, a fundamental trade-off between the accuracy of the outcome of ALCC and its privacy level is observed and is numerically evaluated. Moreover, we implement the proposed scheme to perform matrix-matrix multiplication over a batch of matrices. It is observed that ALCC is superior compared to the state-of-the-art LCC, implemented using fixed-point numbers, assuming both schemes use an equal number of bits to represent data symbols.
LGJul 17, 2020
Privacy-Preserving Distributed Learning in the Analog DomainMahdi Soleymani, Hessam Mahdavifar, A. Salman Avestimehr
We consider the critical problem of distributed learning over data while keeping it private from the computational servers. The state-of-the-art approaches to this problem rely on quantizing the data into a finite field, so that the cryptographic approaches for secure multiparty computing can then be employed. These approaches, however, can result in substantial accuracy losses due to fixed-point representation of the data and computation overflows. To address these critical issues, we propose a novel algorithm to solve the problem when data is in the analog domain, e.g., the field of real/complex numbers. We characterize the privacy of the data from both information-theoretic and cryptographic perspectives, while establishing a connection between the two notions in the analog domain. More specifically, the well-known connection between the distinguishing security (DS) and the mutual information security (MIS) metrics is extended from the discrete domain to the continues domain. This is then utilized to bound the amount of information about the data leaked to the servers in our protocol, in terms of the DS metric, using well-known results on the capacity of single-input multiple-output (SIMO) channel with correlated noise. It is shown how the proposed framework can be adopted to do computation tasks when data is represented using floating-point numbers. We then show that this leads to a fundamental trade-off between the privacy level of data and accuracy of the result. As an application, we also show how to train a machine learning model while keeping the data as well as the trained model private. Then numerical results are shown for experiments on the MNIST dataset. Furthermore, experimental advantages are shown comparing to fixed-point implementations over finite fields.
MLJun 25, 2019
Coding for Crowdsourced Classification with XOR QueriesJames Chin-Jen Pang, Hessam Mahdavifar, S. Sandeep Pradhan
This paper models the crowdsourced labeling/classification problem as a sparsely encoded source coding problem, where each query answer, regarded as a code bit, is the XOR of a small number of labels, as source information bits. In this paper we leverage the connections between this problem and well-studied codes with sparse representations for the channel coding problem to provide querying schemes with almost optimal number of queries, each of which involving only a constant number of labels. We also extend this scenario to the case where some workers can be unresponsive. For this case, we propose querying schemes where each query involves only log n items, where n is the total number of items to be labeled. Furthermore, we consider classification of two correlated labeling systems and provide two-stage querying schemes with almost optimal number of queries each involving a constant number of labels.