LG · Jul 7, 2022
Differentially Private Stochastic Linear Bandits: (Almost) for Free
Osama A. Hanna, Antonious M. Girgis, Christina Fragouli et al. · deepmind
In this paper, we propose differentially private algorithms for the problem of stochastic linear bandits in the central, local and shuffled models. In the central model, we achieve almost the same regret as the optimal non-private algorithms, which means we get privacy for free. In particular, we achieve a regret of $\tilde{O}(\sqrt{T}+\frac{1}{ε})$, matching the known lower bound for private linear bandits, while the best previously known algorithm achieves $\tilde{O}(\frac{1}{ε}\sqrt{T})$. In the local case, we achieve a regret of $\tilde{O}(\frac{1}{ε}\sqrt{T})$, which matches the non-private regret for constant $ε$ but suffers a regret penalty when $ε$ is small. In the shuffled model, we also achieve a regret of $\tilde{O}(\sqrt{T}+\frac{1}{ε})$, as in the central case, while the best previously known algorithm suffers a regret of $\tilde{O}(\frac{1}{ε}T^{3/5})$. Our numerical evaluation validates our theoretical results.
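To make the gap between the bounds in this abstract concrete, the following minimal sketch evaluates the two regret expressions with all constants and logarithmic factors dropped (an assumption, since the $\tilde{O}$ notation hides both). The function names are illustrative and are not taken from the paper.

```python
import math

def central_regret_order(T, eps):
    """Order of the regret reported for the central and shuffled models: sqrt(T) + 1/eps."""
    return math.sqrt(T) + 1.0 / eps

def prior_regret_order(T, eps):
    """Order of the best previously known central-model bound: sqrt(T) / eps."""
    return math.sqrt(T) / eps

if __name__ == "__main__":
    T = 10**6
    # Smaller eps means stronger privacy; compare how each expression grows.
    for eps in (1.0, 0.1, 0.01):
        print(eps, central_regret_order(T, eps), prior_regret_order(T, eps))
```

Under these simplifications, with $T = 10^6$ and $ε = 0.1$ the additive form is about 1010 while the multiplicative form is about 10000, which illustrates why an additive $\frac{1}{ε}$ term is described as privacy "almost for free".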
ML · Nov 8, 2022
Contexts can be Cheap: Solving Stochastic Contextual Bandits with Linear Bandit Algorithms
Osama A. Hanna, Lin F. Yang, Christina Fragouli
In this paper, we address the stochastic contextual linear bandit problem, where a decision maker is provided a context (a random set of actions drawn from a distribution). The expected reward of each action is specified by the inner product of the action and an unknown parameter. The goal is to design an algorithm that learns to play as close as possible to the unknown optimal policy after a number of action plays. This problem is considered more challenging than the linear bandit problem, which can be viewed as a contextual bandit problem with a fixed context. Surprisingly, in this paper, we show that the stochastic contextual problem can be solved as if it is a linear bandit problem. In particular, we establish a novel reduction framework that converts every stochastic contextual linear bandit instance to a linear bandit instance, when the context distribution is known. When the context distribution is unknown, we establish an algorithm that reduces the stochastic contextual instance to a sequence of linear bandit instances with small misspecifications and achieves nearly the same worst-case regret bound as the algorithm that solves the misspecified linear bandit instances. As a consequence, our results imply an $O(d\sqrt{T\log T})$ high-probability regret bound for contextual linear bandits, making progress in resolving an open problem in (Li et al., 2019), (Li et al., 2021). Our reduction framework opens up a new way to approach stochastic contextual linear bandit problems, and enables improved regret bounds in a number of instances including the batch setting, contextual bandits with misspecifications, contextual bandits with sparse unknown parameters, and contextual bandits with adversarial corruption.
IT · Nov 17, 2022
Proactive Resilient Transmission and Scheduling Mechanisms for mmWave Networks
Mine Gokce Dogan, Martina Cardone, Christina Fragouli
This paper aims to develop resilient transmission mechanisms to suitably distribute traffic across multiple paths in an arbitrary millimeter-wave (mmWave) network. The main contributions include: (a) the development of proactive transmission mechanisms that build resilience against network disruptions in advance, while achieving a high end-to-end packet rate; (b) the design of a heuristic path selection algorithm that efficiently selects (in polynomial time in the network size) multiple proactively resilient paths with high packet rates; and (c) the development of a hybrid scheduling algorithm that combines the proposed path selection algorithm with a deep reinforcement learning (DRL) based online approach for decentralized adaptation to blocked links and failed paths. To achieve resilience to link failures, a state-of-the-art Soft Actor-Critic DRL algorithm, which adapts the information flow through the network, is investigated. The proposed scheduling algorithm robustly adapts to link failures over different topologies, channel and blockage realizations while offering a superior performance to alternative algorithms.
LG · Jun 8, 2022
Learning in Distributed Contextual Linear Bandits Without Sharing the Context
Osama A. Hanna, Lin F. Yang, Christina Fragouli
Contextual linear bandits is a rich and theoretically important model that has many practical applications. Recently, this setup gained a lot of interest in applications over wireless where communication constraints can be a performance bottleneck, especially when the contexts come from a large $d$-dimensional space. In this paper, we consider a distributed memoryless contextual linear bandit learning problem, where the agents who observe the contexts and take actions are geographically separated from the learner who performs the learning while not seeing the contexts. We assume that contexts are generated from a distribution and propose a method that uses $\approx 5d$ bits per context for the case of unknown context distribution and $0$ bits per context if the context distribution is known, while achieving nearly the same regret bound as if the contexts were directly observable. The former bound improves upon existing bounds by a $\log(T)$ factor, where $T$ is the length of the horizon, while the latter achieves information theoretical tightness.
LG · Feb 6
ScaleBITS: Scalable Bitwidth Search for Hardware-Aligned Mixed-Precision LLMs
Xinlin Li, Timothy Chou, Josh Fromm et al.
Post-training weight quantization is crucial for reducing the memory and inference cost of large language models (LLMs), yet pushing the average precision below 4 bits remains challenging due to highly non-uniform weight sensitivity and the lack of principled precision allocation. Existing solutions use irregular fine-grained mixed-precision with high runtime overhead or rely on heuristics or highly constrained precision allocation strategies. In this work, we propose ScaleBITS, a mixed-precision quantization framework that enables automated, fine-grained bitwidth allocation under a memory budget while preserving hardware efficiency. Guided by a new sensitivity analysis, we introduce a hardware-aligned, block-wise weight partitioning scheme, powered by bi-directional channel reordering. We formulate global bitwidth allocation as a constrained optimization problem and develop a scalable approximation to the greedy algorithm, enabling end-to-end principled allocation. Experiments show that ScaleBITS significantly improves over uniform-precision quantization (up to +36%) and outperforms state-of-the-art sensitivity-aware baselines (up to +13%) in ultra-low-bit regime, without adding runtime overhead.
LG · May 1, 2025
ICQuant: Index Coding enables Low-bit LLM Quantization
Xinlin Li, Osama Hanna, Christina Fragouli et al.
The rapid deployment of Large Language Models (LLMs) highlights the need for efficient low-bit post-training quantization (PTQ), due to their high memory costs. A key challenge in weight quantization is the presence of outliers, which inflate quantization ranges and lead to large errors. While a number of outlier suppression techniques have been proposed, they either: fail to effectively shrink the quantization range, or incur (relatively) high bit overhead. In this paper, we present ICQuant, a novel framework that leverages outlier statistics to design an efficient index coding scheme for outlier-aware weight-only quantization. Compared to existing outlier suppression techniques requiring $\approx 1$ bit overhead to halve the quantization range, ICQuant requires only $\approx 0.3$ bits; a significant saving in extreme compression regimes (e.g., 2-3 bits per weight). ICQuant can be used on top of any existing quantizers to eliminate outliers, improving the quantization quality. Using just 2.3 bits per weight and simple scalar quantizers, ICQuant improves the zero-shot accuracy of the 2-bit Llama3-70B model by up to 130% and 150% relative to QTIP and QuIP#; and it achieves comparable performance to the best-known fine-tuned quantizer (PV-tuning) without fine-tuning.
LG · Dec 21, 2023
Multi-Agent Bandit Learning through Heterogeneous Action Erasure Channels
Osama A. Hanna, Merve Karakas, Lin F. Yang et al.
Multi-Armed Bandit (MAB) systems are witnessing an upswing in applications within multi-agent distributed environments, leading to the advancement of collaborative MAB algorithms. In such settings, communication between agents executing actions and the primary learner making decisions can hinder the learning process. A prevalent challenge in distributed learning is action erasure, often induced by communication delays and/or channel noise. This results in agents possibly not receiving the intended action from the learner, subsequently leading to misguided feedback. In this paper, we introduce novel algorithms that enable learners to interact concurrently with distributed agents across heterogeneous action erasure channels with different action erasure probabilities. We illustrate that, in contrast to existing bandit algorithms, which experience linear regret, our algorithms assure sub-linear regret guarantees. Our proposed solutions are founded on a meticulously crafted repetition protocol and scheduling of learning across heterogeneous channels. To our knowledge, these are the first algorithms capable of effectively learning through heterogeneous action erasure channels. We substantiate the superior performance of our algorithm through numerical experiments, emphasizing their practical significance in addressing issues related to communication constraints and delays in multi-agent environments.
15.4 · IT · Apr 8
Top-P Sensor Selection for Target Localization
Kaan Buyukkalayci, Kyle Pak, Merve Karakas et al.
We study set-valued decision rules in which performance is defined by the inclusion of the top-$p$ hypotheses, rather than only the single best or true hypothesis. This criterion is motivated by sensor selection for target tracking, where inexpensive measurements are used to identify a list of sensor nodes that are likely to be closest to a target. We analyze the performance of top-$p$ versus top-$1$ selection under sequential hypothesis testing, propose a geometry-aware sensor selection algorithm, and validate the approach using real testbed data.
41.9 · IT · Apr 2
Best-Arm Identification with Noisy Actuation
Merve Karakas, Osama Hanna, Lin F. Yang et al.
In this paper, we consider a multi-armed bandit (MAB) instance and study how to identify the best arm when arm commands are conveyed from a central learner to a distributed agent over a discrete memoryless channel (DMC). Depending on the agent capabilities, we provide communication schemes along with their analysis, which interestingly relate to the zero-error capacity of the underlying DMC.
13.4 · LG · Mar 13
A Reduction Algorithm for Markovian Contextual Linear Bandits
Kaan Buyukkalayci, Osama Hanna, Christina Fragouli
Recent work shows that when contexts are drawn i.i.d., linear contextual bandits can be reduced to single-context linear bandits. This "contexts are cheap" perspective is highly advantageous, as it allows for sharper finite-time analyses and leverages mature techniques from the linear bandit literature, such as those for misspecification and adversarial corruption. Motivated by applications with temporally correlated availability, we extend this perspective to Markovian contextual linear bandits, where the action set evolves via an exogenous Markov chain. Our main contribution is a reduction that applies under uniform geometric ergodicity. We construct a stationary surrogate action set to solve the problem using a standard linear bandit oracle, employing a delayed-update scheme to control the bias induced by the nonstationary conditional context distributions. We further provide a phased algorithm for unknown transition distributions that learns the surrogate mapping online. In both settings, we obtain a high-probability worst-case regret bound matching that of the underlying linear bandit oracle, with only lower-order dependence on the mixing time.
LG · Apr 29, 2025
Does Feedback Help in Bandits with Arm Erasures?
Merve Karakas, Osama Hanna, Lin F. Yang et al.
We study a distributed multi-armed bandit (MAB) problem over arm erasure channels, motivated by the increasing adoption of MAB algorithms over communication-constrained networks. In this setup, the learner communicates the chosen arm to play to an agent over an erasure channel with probability $ε\in [0,1)$; if an erasure occurs, the agent continues pulling the last successfully received arm; the learner always observes the reward of the arm pulled. In past work, we considered the case where the agent cannot convey feedback to the learner, and thus the learner does not know whether the arm played is the requested or the last successfully received one. In this paper, we instead consider the case where the agent can send feedback to the learner on whether the arm request was received, and thus the learner exactly knows which arm was played. Surprisingly, we prove that erasure feedback does not improve the worst-case regret upper bound order over the previously studied no-feedback setting. In particular, we prove a regret lower bound of $Ω(\sqrt{KT} + K / (1 - ε))$, where $K$ is the number of arms and $T$ the time horizon, that matches no-feedback upper bounds up to logarithmic factors. We note however that the availability of feedback enables simpler algorithm designs that may achieve better constants (albeit not better order) regret bounds; we design one such algorithm and evaluate its performance numerically.
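The channel model described in this abstract can be sketched as a short simulation: each requested arm is erased independently with probability $ε$, and on an erasure the agent keeps pulling the last successfully received arm. This is a minimal sketch under stated assumptions. The function name, the `initial_arm` played before any request gets through, and the fixed `seed` are hypothetical choices made for illustration and do not come from the paper.

```python
import random

def simulate_arm_erasures(requested, eps, initial_arm=0, seed=0):
    """Simulate an arm erasure channel.

    requested   : list of arms the learner asks for, one per round
    eps         : erasure probability of each transmission
    initial_arm : arm the agent plays before any request is received (assumption)
    Returns (played, acks): the arm actually pulled each round, and whether
    that round's request was received (the feedback bit in the feedback setting).
    """
    rng = random.Random(seed)
    played, acks = [], []
    last_received = initial_arm
    for arm in requested:
        erased = rng.random() < eps
        if not erased:
            last_received = arm       # request got through; agent switches arms
        played.append(last_received)  # on erasure the agent repeats the last received arm
        acks.append(not erased)
    return played, acks

if __name__ == "__main__":
    played, acks = simulate_arm_erasures([1, 2, 3, 1, 2], eps=0.5)
    print(played, acks)
```

In the no-feedback setting the learner sees only the rewards, while in the feedback setting it also sees `acks` and can therefore reconstruct `played` exactly.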
ML · Jun 26, 2024
Learning for Bandits under Action Erasures
Osama Hanna, Merve Karakas, Lin F. Yang et al.
We consider a novel multi-arm bandit (MAB) setup, where a learner needs to communicate the actions to distributed agents over erasure channels, while the rewards for the actions are directly available to the learner through external sensors. In our model, while the distributed agents know if an action is erased, the central learner does not (there is no feedback), and thus does not know whether the observed reward resulted from the desired action or not. We propose a scheme that can work on top of any (existing or future) MAB algorithm and make it robust to action erasures. Our scheme results in a worst-case regret over action-erasure channels that is at most a factor of $O(1/\sqrt{1-ε})$ away from the no-erasure worst-case regret of the underlying MAB algorithm, where $ε$ is the erasure probability. We also propose a modification of the successive arm elimination algorithm and prove that its worst-case regret is $\tilde{O}(\sqrt{KT}+K/(1-ε))$, which we prove is optimal by providing a matching lower bound.
LG · Nov 11, 2021
Solving Multi-Arm Bandit Using a Few Bits of Communication
Osama A. Hanna, Lin F. Yang, Christina Fragouli
The multi-armed bandit (MAB) problem is an active learning framework that aims to select the best among a set of actions by sequentially observing rewards. Recently, it has become popular for a number of applications over wireless networks, where communication constraints can form a bottleneck. Existing works usually fail to address this issue and can become infeasible in certain applications. In this paper we address the communication problem by optimizing the communication of rewards collected by distributed agents. By providing nearly matching upper and lower bounds, we tightly characterize the number of bits needed per reward for the learner to accurately learn without suffering additional regret. In particular, we establish a generic reward quantization algorithm, QuBan, that can be applied on top of any (no-regret) MAB algorithm to form a new communication-efficient counterpart that requires only a few (as low as 3) bits to be sent per iteration while preserving the same regret bound. Our lower bound is established via constructing hard instances from a subgaussian distribution. Our theory is further corroborated by numerical experiments.
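The abstract does not spell out how QuBan quantizes rewards, so the following is not QuBan itself. It is only a minimal sketch of stochastic rounding, a standard way to quantize a real-valued reward to an integer while keeping the quantized value unbiased, which is the kind of property a reward quantizer needs if the learner's estimates are to stay accurate. The function name and the integer grid are assumptions made for illustration.

```python
import math
import random

def stochastic_round(x, rng):
    """Round x to floor(x) or floor(x) + 1 at random so that the expectation equals x."""
    lo = math.floor(x)
    p_up = x - lo  # probability of rounding up, in [0, 1)
    return lo + (1 if rng.random() < p_up else 0)

if __name__ == "__main__":
    rng = random.Random(0)
    reward = 2.3
    samples = [stochastic_round(reward, rng) for _ in range(20000)]
    # Each sample is 2 or 3; their average should be close to the true reward.
    print(sum(samples) / len(samples))
```

Each quantized value lies on an integer grid and can be sent with few bits once an offset is agreed upon, and averaging many quantized rewards recovers the true mean up to sampling noise.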
IT · Aug 1, 2021
A Reinforcement Learning Approach for Scheduling in mmWave Networks
Mine Gokce Dogan, Yahya H. Ezzeldin, Christina Fragouli et al.
We consider a source that wishes to communicate with a destination at a desired rate, over a mmWave network where links are subject to blockage and nodes to failure (e.g., in a hostile military environment). To achieve resilience to link and node failures, we here explore a state-of-the-art Soft Actor-Critic (SAC) deep reinforcement learning algorithm, that adapts the information flow through the network, without using knowledge of the link capacities or network topology. Numerical evaluations show that our algorithm can achieve the desired rate even in dynamic environments and it is robust against blockage.
LG · Dec 14, 2020
Quantizing data for distributed learning
Osama A. Hanna, Yahya H. Ezzeldin, Christina Fragouli et al.
We consider machine learning applications that train a model by leveraging data distributed over a trusted network, where communication constraints can create a performance bottleneck. A number of recent approaches propose to overcome this bottleneck through compression of gradient updates. However, as models become larger, so does the size of the gradient updates. In this paper, we propose an alternate approach to learn from distributed data that quantizes data instead of gradients, and can support learning over applications where the size of gradient updates is prohibitive. Our approach leverages the dependency of the computed gradient on data samples, which lie in a much smaller space in order to perform the quantization in the smaller dimension data space. At the cost of an extra gradient computation, the gradient estimate can be refined by conveying the difference between the gradient at the quantized data point and the original gradient using a small number of bits. Lastly, in order to save communication, our approach adds a layer that decides whether to transmit a quantized data sample or not based on its importance for learning. We analyze the convergence of the proposed approach for smooth convex and non-convex objective functions and show that we can achieve order optimal convergence rates with communication that mostly depends on the data rather than the model (gradient) dimension. We use our proposed algorithm to train ResNet models on the CIFAR-10 and ImageNet datasets, and show that we can achieve an order of magnitude savings over gradient compression methods. These communication savings come at the cost of increasing computation at the learning agent, and thus our approach is beneficial in scenarios where communication load is the main problem.
IT · Jun 25, 2020
Distortion based Light-weight Security for Cyber-Physical Systems
Gaurav Kumar Agarwal, Mohammed Karmoose, Suhas Diggavi et al.
In Cyber-Physical Systems (CPS), inference based on communicated data is of critical significance as it can be used to manipulate or damage the control operations by adversaries. This calls for efficient mechanisms for secure transmission of data since control systems are becoming increasingly distributed over larger geographical areas. Distortion based security, recently proposed as one candidate for secure transmissions in CPS, is not only more appropriate for these applications but also quite frugal in terms of prior requirements on shared keys. In this paper, we propose distortion-based metrics to protect CPS communication and show that it is possible to confuse adversaries with just a few bits of pre-shared keys. In particular, we will show that a linear dynamical system can communicate its state in a manner that prevents an eavesdropper from accurately learning the state.
CR · May 24, 2020
Successive Refinement of Privacy
Antonious M. Girgis, Deepesh Data, Kamalika Chaudhuri et al.
This work examines a novel question: how much randomness is needed to achieve local differential privacy (LDP)? A motivating scenario is providing multiple levels of privacy to multiple analysts, either for distribution or for heavy-hitter estimation, using the same (randomized) output. We call this setting successive refinement of privacy, as it provides hierarchical access to the raw data with different privacy levels. For example, the same randomized output could enable one analyst to reconstruct the input, while another can only estimate the distribution subject to LDP requirements. This extends the classical Shannon (wiretap) security setting to local differential privacy. We provide (order-wise) tight characterizations of privacy-utility-randomness trade-offs in several cases for distribution estimation, including the standard LDP setting under a randomness constraint. We also provide a non-trivial privacy mechanism for multi-level privacy. Furthermore, we show that we cannot reuse random keys over time while preserving privacy of each user.
LG · May 14, 2020
Federated Recommendation System via Differential Privacy
Tan Li, Linqi Song, Christina Fragouli
In this paper, we are interested in what we term the federated private bandits framework, which combines differential privacy with multi-agent bandit learning. We explore how differential privacy based Upper Confidence Bound (UCB) methods can be applied to multi-agent environments, and in particular to federated learning environments in both 'master-worker' and 'fully decentralized' settings. We provide a theoretical analysis of the privacy and regret performance of the proposed methods and explore the tradeoffs between the two.
LG · Nov 1, 2019
On Distributed Quantization for Classification
Osama A. Hanna, Yahya H. Ezzeldin, Tara Sadjadpour et al.
We consider the problem of distributed feature quantization, where the goal is to enable a pretrained classifier at a central node to carry out its classification on features that are gathered from distributed nodes through communication constrained channels. We propose the design of distributed quantization schemes specifically tailored to the classification task: unlike quantization schemes that help the central node reconstruct the original signal as accurately as possible, our focus is not reconstruction accuracy, but instead correct classification. Our work does not make any a priori distributional assumptions on the data, but instead uses training data for the quantizer design. Our main contributions are as follows: we prove NP-hardness of finding optimal quantizers in the general case; we design an optimal scheme for a special case; and we propose quantization algorithms that leverage discrete neural representations and training data, and can be designed in polynomial time for any number of features, any number of classes, and arbitrary division of features across the distributed nodes. We find that tailoring the quantizers to the classification task can offer significant savings: as compared to alternatives, we can achieve more than a factor of two reduction in terms of the number of bits communicated, for the same classification accuracy.
IR · Oct 15, 2018
Regret vs. Bandwidth Trade-off for Recommendation Systems
Linqi Song, Christina Fragouli, Devavrat Shah
We consider recommendation systems that need to operate under wireless bandwidth constraints, measured as number of broadcast transmissions, and demonstrate a (tight for some instances) tradeoff between regret and bandwidth for two scenarios: the case of multi-armed bandit with context, and the case where there is a latent structure in the message space that we can exploit to reduce the learning phase.
CR · Mar 22, 2018
Using mm-Waves for Secret Key Establishment
Mohammed Karmoose, Christina Fragouli, Suhas Diggavi et al.
The fact that Millimeter Wave (mmWave) communication needs to be directional is usually perceived as a challenge; in this paper we argue that it enables efficient secret key sharing that is unconditionally secure from passive eavesdroppers, by building on packet erasures. We showcase the potential of our approach in two setups: mmWave-based WiFi networks and vehicle platooning. We show that in the first case, we can establish a few hundred secret bits with minimal changes to the standard communication protocol, while in both cases, with the right choice of parameters, we can potentially establish keys in the order of tenths of Mbps. These first results are based on some simplifying assumptions, yet we believe they give incentives to further explore such techniques.
CR · Apr 8, 2016
Group secret key agreement over state-dependent wireless broadcast channels
Mahdi Jafari Siavoshani, Shaunak Mishra, Christina Fragouli et al.
We consider a group of $m$ trusted and authenticated nodes that aim to create a shared secret key $K$ over a wireless channel in the presence of an eavesdropper Eve. We assume that there exists a state dependent wireless broadcast channel from one of the honest nodes to the rest of them including Eve. All of the trusted nodes can also discuss over a cost-free, noiseless and unlimited rate public channel which is also overheard by Eve. For this setup, we develop an information-theoretically secure secret key agreement protocol. We show the optimality of this protocol for "linear deterministic" wireless broadcast channels. This model generalizes the packet erasure model studied in literature for wireless broadcast channels. For "state-dependent Gaussian" wireless broadcast channels, we propose an achievability scheme based on a multi-layer wiretap code. Finding the best achievable secret key generation rate leads to solving a non-convex power allocation problem. We show that using a dynamic programming algorithm, one can obtain the best power allocation for this problem. Moreover, we prove the optimality of the proposed achievability scheme for the regime of high-SNR and large-dynamic range over the channel states in the (generalized) degrees of freedom sense.
NI · May 14, 2014
MicroCast: Cooperative Video Streaming using Cellular and D2D Connections
Anh Le, Lorenzo Keller, Hulya Seferoglu et al.
We consider a group of mobile users, within proximity of each other, who are interested in watching the same online video at roughly the same time. The common practice today is that each user downloads the video independently on her mobile device using her own cellular connection, which wastes access bandwidth and may also lead to poor video quality. We propose a novel cooperative system where each mobile device uses simultaneously two network interfaces: (i) the cellular to connect to the video server and download parts of the video and (ii) WiFi to connect locally to all other devices in the group and exchange those parts. Devices cooperate to efficiently utilize all network resources and are able to adapt to varying wireless network conditions. In the local WiFi network, we exploit overhearing, and we further combine it with network coding. The end result is savings in cellular bandwidth and improved user experience (faster download) by a factor on the order up to the group size. We follow a complete approach, from theory to practice. First, we formulate the problem using a network utility maximization (NUM) framework, decompose the problem, and provide a distributed solution. Then, based on the structure of the NUM solution, we design a modular system called MicroCast and we implement it as an Android application. We provide both simulation results of the NUM solution and experimental evaluation of MicroCast on a testbed consisting of Android phones. We demonstrate that the proposed approach brings significant performance benefits without battery penalty.
IT · May 14, 2013
Using Feedback for Secrecy over Graphs
Shaunak Mishra, Christina Fragouli, Vinod Prabhakaran et al.
We study the problem of secure message multicasting over graphs in the presence of a passive (node) adversary who tries to eavesdrop in the network. We show that use of feedback, facilitated through the existence of cycles or undirected edges, enables higher rates than possible in directed acyclic graphs of the same mincut. We demonstrate this using code constructions for canonical combination networks (CCNs). We also provide general outer bounds as well as schemes for node adversaries over CCNs.
IT · Jun 14, 2012
Multi-terminal Secrecy in a Linear Non-coherent Packetized Networks
Mahdi Jafari Siavoshani, Christina Fragouli
We consider a group of m+1 trusted nodes that aim to create a shared secret key K over a network in the presence of a passive eavesdropper, Eve. We assume a linear non-coherent network coding broadcast channel (over a finite field F_q) from one of the honest nodes (i.e., Alice) to the rest of them including Eve. All of the trusted nodes can also discuss over a cost-free public channel which is also overheard by Eve. For this setup, we propose upper and lower bounds for the secret key generation capacity assuming that the field size q is very large. For the case of two trusted terminals (m = 1), our upper and lower bounds match and we have a complete characterization of the secrecy capacity in the large field size regime.