LG Jan 31, 2025
Byzantine-Resilient Zero-Order Optimization for Communication-Efficient Heterogeneous Federated Learning
Maximilian Egger, Mayank Bakshi, Rawad Bitar
We introduce CyBeR-0, a Byzantine-resilient federated zero-order optimization method that is robust under Byzantine attacks and provides significant savings in uplink and downlink communication costs. We introduce transformed robust aggregation to give convergence guarantees for general non-convex objectives under client data heterogeneity. Empirical evaluations for standard learning tasks and fine-tuning large language models show that CyBeR-0 exhibits stable performance with a per-round communication cost of only a few scalars and with reduced memory requirements.
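To make the "few scalars per round" idea concrete, here is a minimal sketch of a generic federated zero-order round. It is not the CyBeR-0 algorithm: the plain median stands in for the paper's transformed robust aggregation, which the abstract does not specify, and every name (`shared_direction`, `zo_scalar`, `run_round`, `make_quadratic`, `mean_loss`), the quadratic client losses, and the constant Byzantine value are assumptions made for illustration. Each client regenerates the same random direction from a shared seed, so only one scalar per client needs to travel uplink.

```python
import random
import statistics


def shared_direction(seed, dim):
    """Regenerate the common perturbation direction from a shared seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def zo_scalar(loss, w, z, mu=1e-3):
    """Two-point zero-order estimate of the directional derivative along z."""
    w_plus = [wi + mu * zi for wi, zi in zip(w, z)]
    w_minus = [wi - mu * zi for wi, zi in zip(w, z)]
    return (loss(w_plus) - loss(w_minus)) / (2.0 * mu)


def run_round(w, honest_losses, seed, lr=0.02, n_byzantine=0):
    """One round: each client reports a single scalar, the server takes a median."""
    z = shared_direction(seed, len(w))
    scalars = [zo_scalar(loss, w, z) for loss in honest_losses]
    # Assumed attack model: Byzantine clients report an arbitrary large value.
    scalars += [1e6] * n_byzantine
    # Stand-in robust aggregator (plain median), not the paper's method.
    g = statistics.median(scalars)
    return [wi - lr * g * zi for wi, zi in zip(w, z)]


def make_quadratic(target):
    """Hypothetical client loss: squared distance to a client-specific target."""
    return lambda w: sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def mean_loss(w, losses):
    """Average loss over the honest clients."""
    return sum(loss(w) for loss in losses) / len(losses)
```

In this sketch, slightly different targets per client play the role of data heterogeneity, and the median keeps a single outlier scalar from dominating the update. A plain average of the same scalars would be pulled arbitrarily far by the value `1e6`.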
LG Jun 20, 2024
Communication-Efficient Byzantine-Resilient Federated Zero-Order Optimization
Afonso de Sá Delgado Neto, Maximilian Egger, Mayank Bakshi et al.
We introduce CYBER-0, the first zero-order optimization algorithm for memory- and communication-efficient Federated Learning that is resilient to Byzantine faults. We show through extensive numerical experiments on the MNIST dataset and on fine-tuning RoBERTa-Large that CYBER-0 outperforms state-of-the-art algorithms in terms of communication and memory efficiency while reaching similar accuracy. We provide theoretical guarantees on its convergence for convex loss functions.
LG May 12, 2024
VALID: a Validated Algorithm for Learning in Decentralized Networks with Possible Adversarial Presence
Mayank Bakshi, Sara Ghasvarianjahromi, Yauhen Yakimenka et al.
We introduce the paradigm of validated decentralized learning for undirected networks with heterogeneous data and possible adversarial infiltration. We require (a) convergence to a global empirical loss minimizer when adversaries are absent, and (b) either detection of adversarial presence or convergence to an admissible consensus, irrespective of the adversarial configuration. To this end, we propose the VALID protocol which, to the best of our knowledge, is the first to achieve a validated learning guarantee. Moreover, VALID offers an O(1/T) convergence rate (under pertinent regularity assumptions), and computational and communication complexities comparable to non-adversarial distributed stochastic gradient descent. Remarkably, VALID retains optimal performance metrics in adversary-free environments, sidestepping the robustness penalties observed in prior Byzantine-robust methods. A distinctive aspect of our study is a heterogeneity metric based on the norms of individual agents' gradients computed at the global empirical loss minimizer. This not only provides a natural statistic for detecting significant Byzantine disruptions but also allows us to prove the optimality of VALID in wide generality. Lastly, our numerical results reveal that, in the absence of adversaries, VALID converges faster than state-of-the-art Byzantine-robust algorithms, while when adversaries are present, VALID terminates with each honest agent either converging to an admissible consensus or declaring adversarial presence in the network.
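The gradient-norm statistic mentioned in the abstract can be illustrated on a toy problem. The sketch below assumes each agent i holds the hypothetical loss f_i(w) = 0.5 * ||w - c_i||^2, so the global empirical loss minimizer is the mean of the targets c_i and agent i's gradient there is w* - c_i. The function names are illustrative, and how VALID combines or thresholds these norms is not stated in the abstract, so the code only returns the per-agent values.

```python
import math


def global_minimizer(targets):
    """Minimizer of the average of 0.5 * ||w - c_i||^2, i.e. the mean target."""
    n = len(targets)
    dim = len(targets[0])
    return [sum(t[k] for t in targets) / n for k in range(dim)]


def gradient_norms_at_minimizer(targets):
    """Norm of each agent's local gradient (w* - c_i) at the global minimizer."""
    w_star = global_minimizer(targets)
    return [
        math.sqrt(sum((wk - tk) ** 2 for wk, tk in zip(w_star, t)))
        for t in targets
    ]
```

With identical targets every norm is zero (homogeneous data). The norms grow as the agents' local minimizers move apart, which is why a statistic of this kind can serve as a measure of heterogeneity.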
IT Jan 25, 2016
Plausible Deniability over Broadcast Channels
Mayank Bakshi, Vinod Prabhakaran
In this paper, we introduce the notion of Plausible Deniability in an information theoretic framework. We consider a scenario where an entity that eavesdrops through a broadcast channel summons one of the parties in a communication protocol to reveal their message (or signal vector). It is desirable that the summoned party have enough freedom to produce a fake output that is plausible given the eavesdropper's observation. We examine three variants of this problem -- Message Deniability, Transmitter Deniability, and Receiver Deniability. In the first setting, the message sender is summoned to produce the sent message. Similarly, in the second and third settings, the transmitter and the receiver are required to produce the transmitted codeword and the received vector, respectively. For each of these settings, we examine the maximum communication rate that allows a given minimum rate of plausible fake outputs. For the Message and Transmitter Deniability problems, we fully characterise the capacity region for general broadcast channels, while for the Receiver Deniability problem, we give an achievable rate region for physically degraded broadcast channels.