Siddharth Barman

h-index19

14papers

347citations

Novelty53%

AI Score49

Ranked #25,403 of 194,257 authors (top 13%)#6,075 in LG (top 15%)

14 Papers

14.1LGMay 27, 2022

Fairness and Welfare Quantification for Regret in Multi-Armed Bandits

Siddharth Barman, Arindam Khan, Arnab Maiti et al. · stanford

We extend the notion of regret with a welfarist perspective. Focussing on the classic multi-armed bandit (MAB) framework, the current work quantifies the performance of bandit algorithms by applying a fundamental welfare function, namely the Nash social welfare (NSW) function. This corresponds to equating algorithm's performance to the geometric mean of its expected rewards and leads us to the study of Nash regret, defined as the difference between the -- a priori unknown -- optimal mean (among the arms) and the algorithm's performance. Since NSW is known to satisfy fairness axioms, our approach complements the utilitarian considerations of average (cumulative) regret, wherein the algorithm is evaluated via the arithmetic mean of its expected rewards. This work develops an algorithm that, given the horizon of play $T$, achieves a Nash regret of $O \left( \sqrt{\frac{k \log T}{T}} \right)$, here $k$ denotes the number of arms in the MAB instance. Since, for any algorithm, the Nash regret is at least as much as its average regret (the AM-GM inequality), the known lower bound on average regret holds for Nash regret as well. Therefore, our Nash regret guarantee is essentially tight. In addition, we develop an anytime algorithm with a Nash regret guarantee of $O \left( \sqrt{\frac{k\log T}{T}} \log T \right)$.

10.7LGOct 3, 2023

Nash Regret Guarantees for Linear Bandits

Ayush Sawarni, Soumybrata Pal, Siddharth Barman · stanford

We obtain essentially tight upper bounds for a strengthened notion of regret in the stochastic linear bandits framework. The strengthening -- referred to as Nash regret -- is defined as the difference between the (a priori unknown) optimum and the geometric mean of expected rewards accumulated by the linear bandit algorithm. Since the geometric mean corresponds to the well-studied Nash social welfare (NSW) function, this formulation quantifies the performance of a bandit algorithm as the collective welfare it generates across rounds. NSW is known to satisfy fairness axioms and, hence, an upper bound on Nash regret provides a principled fairness guarantee. We consider the stochastic linear bandits problem over a horizon of $T$ rounds and with set of arms ${X}$ in ambient dimension $d$. Furthermore, we focus on settings in which the stochastic reward -- associated with each arm in ${X}$ -- is a non-negative, $ν$-sub-Poisson random variable. For this setting, we develop an algorithm that achieves a Nash regret of $O\left( \sqrt{\frac{dν}{T}} \log( T |X|)\right)$. In addition, addressing linear bandit instances in which the set of arms ${X}$ is not necessarily finite, we obtain a Nash regret upper bound of $O\left( \frac{d^\frac{5}{4}ν^{\frac{1}{2}}}{\sqrt{T}} \log(T)\right)$. Since bounded random variables are sub-Poisson, these results hold for bounded, positive rewards. Our linear bandit algorithm is built upon the successive elimination method with novel technical insights, including tailored concentration bounds and the use of sampling via John ellipsoid in conjunction with the Kiefer-Wolfowitz optimal design.

8.8DSMay 10

Equitable Colorings of Vertex-Weighted Graphs

Siddharth Barman, Vignesh Viswanathan

We study a generalization of the classical Hajnal-Szemerédi theorem to vertex-weighted graphs. Given a graph with nonnegative vertex weights, a coloring is called $α$-approximately equitable up to one vertex ($α$-EQ1) if, for each color class, the total weight remaining after removing its maximum-weight vertex is at most $α\geq 1$ times the weight of any other color class. For vertex-weighted graphs with maximum degree $Δ$, we show that there exist instances for which no $k$-coloring is $α$-EQ1 for any $k < \frac{3Δ}{2}$ and $α< \sqrt{2}$. In light of this impossibility, we relax these parameters and establish the following results for any vertex-weighted graph $G$ with maximum degree $Δ$: (1) for any $\varepsilon \in (0,1)$ and all $k \geq (\frac{c}{\varepsilon^2}\ln{\frac{1}{\varepsilon}}) Δ$, there exists a $(1 + \varepsilon)$-EQ1 $k$-coloring of $G$, where $c$ is a fixed constant; and (2) for all $k \ge Δ+ 1$, there exists a $2$-EQ1 $k$-coloring of $G$. Furthermore, such equitable colorings can be computed in polynomial time. En route to our results on equitability under vertex weights, we establish sufficient conditions for the existence of $k$-colorings that are equitable with respect to any given partition of the vertex set. Our coloring results correspond to fairness guarantees in a constrained fair division setting and lead to concentration inequalities for partly dependent random variables.

1.2DSFeb 9

Welfarist Formulations for Diverse Similarity Search

Siddharth Barman, Nirjhar Das, Shivam Gupta et al.

Nearest Neighbor Search (NNS) is a fundamental problem in data structures with wide-ranging applications, such as web search, recommendation systems, and, more recently, retrieval-augmented generations (RAG). In such recent applications, in addition to the relevance (similarity) of the returned neighbors, diversity among the neighbors is a central requirement. In this paper, we develop principled welfare-based formulations in NNS for realizing diversity across attributes. Our formulations are based on welfare functions -- from mathematical economics -- that satisfy central diversity (fairness) and relevance (economic efficiency) axioms. With a particular focus on Nash social welfare, we note that our welfare-based formulations provide objective functions that adaptively balance relevance and diversity in a query-dependent manner. Notably, such a balance was not present in the prior constraint-based approach, which forced a fixed level of diversity and optimized for relevance. In addition, our formulation provides a parametric way to control the trade-off between relevance and diversity, providing practitioners with flexibility to tailor search results to task-specific requirements. We develop efficient nearest neighbor algorithms with provable guarantees for the welfare-based objectives. Notably, our algorithm can be applied on top of any standard ANN method (i.e., use standard ANN method as a subroutine) to efficiently find neighbors that approximately maximize our welfare-based objectives. Experimental results demonstrate that our approach is practical and substantially improves diversity while maintaining high relevance of the retrieved neighbors.

20.3LGApr 10, 2024Code

Generalized Linear Bandits with Limited Adaptivity

Ayush Sawarni, Nirjhar Das, Siddharth Barman et al. · stanford

We study the generalized linear contextual bandit problem within the constraints of limited adaptivity. In this paper, we present two algorithms, $\texttt{B-GLinCB}$ and $\texttt{RS-GLinCB}$, that address, respectively, two prevalent limited adaptivity settings. Given a budget $M$ on the number of policy updates, in the first setting, the algorithm needs to decide upfront $M$ rounds at which it will update its policy, while in the second setting it can adaptively perform $M$ policy updates during its course. For the first setting, we design an algorithm $\texttt{B-GLinCB}$, that incurs $\tilde{O}(\sqrt{T})$ regret when $M = Ω( \log{\log T} )$ and the arm feature vectors are generated stochastically. For the second setting, we design an algorithm $\texttt{RS-GLinCB}$ that updates its policy $\tilde{O}(\log^2 T)$ times and achieves a regret of $\tilde{O}(\sqrt{T})$ even when the arm feature vectors are adversarially generated. Notably, in these bounds, we manage to eliminate the dependence on a key instance dependent parameter $κ$, that captures non-linearity of the underlying reward model. Our novel approach for removing this dependence for generalized linear contextual bandits might be of independent interest.

3.3GTOct 30, 2024

Fair Division with Market Values

Siddharth Barman, Soroush Ebadian, Mohamad Latifian et al. · utoronto

We introduce a model of fair division with market values, where indivisible goods must be partitioned among agents with (additive) subjective valuations, and each good additionally has a market value. The market valuation can be viewed as a separate additive valuation that holds identically across all the agents. We seek allocations that are simultaneously fair with respect to the subjective valuations and with respect to the market valuation. We show that an allocation that satisfies stochastically-dominant envy-freeness up to one good (SD-EF1) with respect to both the subjective valuations and the market valuation does not always exist, but the weaker guarantee of EF1 with respect to the subjective valuations along with SD-EF1 with respect to the market valuation can be guaranteed. We also study a number of other guarantees such as Pareto optimality, EFX, and MMS. In addition, we explore non-additive valuations and extend our model to cake-cutting. Along the way, we identify several tantalizing open questions.

8.8LGMay 8, 2023

Learning Good Interventions in Causal Graphs via Covering

Ayush Sawarni, Rahul Madhavan, Gaurav Sinha et al.

We study the causal bandit problem that entails identifying a near-optimal intervention from a specified set $A$ of (possibly non-atomic) interventions over a given causal graph. Here, an optimal intervention in ${A}$ is one that maximizes the expected value for a designated reward variable in the graph, and we use the standard notion of simple regret to quantify near optimality. Considering Bernoulli random variables and for causal graphs on $N$ vertices with constant in-degree, prior work has achieved a worst case guarantee of $\widetilde{O} (N/\sqrt{T})$ for simple regret. The current work utilizes the idea of covering interventions (which are not necessarily contained within ${A}$) and establishes a simple regret guarantee of $\widetilde{O}(\sqrt{N/T})$. Notably, and in contrast to prior work, our simple regret bound depends only on explicit parameters of the problem instance. We also go beyond prior work and achieve a simple regret guarantee for causal graphs with unobserved variables. Further, we perform experiments to show improvements over baselines in this setting.

3.1LGNov 1, 2021

Intervention Efficient Algorithm for Two-Stage Causal MDPs

Rahul Madhavan, Aurghya Maiti, Gaurav Sinha et al.

We study Markov Decision Processes (MDP) wherein states correspond to causal graphs that stochastically generate rewards. In this setup, the learner's goal is to identify atomic interventions that lead to high rewards by intervening on variables at each state. Generalizing the recent causal-bandit framework, the current work develops (simple) regret minimization guarantees for two-stage causal MDPs, with parallel causal graph at each state. We propose an algorithm that achieves an instance dependent regret bound. A key feature of our algorithm is that it utilizes convex optimization to address the exploration problem. We identify classes of instances wherein our regret guarantee is essentially tight, and experimentally validate our theoretical results.

1.6LGMay 4, 2021

Optimal Algorithms for Range Searching over Multi-Armed Bandits

Siddharth Barman, Ramakrishnan Krishnamurthy, Saladi Rahul

This paper studies a multi-armed bandit (MAB) version of the range-searching problem. In its basic form, range searching considers as input a set of points (on the real line) and a collection of (real) intervals. Here, with each specified point, we have an associated weight, and the problem objective is to find a maximum-weight point within every given interval. The current work addresses range searching with stochastic weights: each point corresponds to an arm (that admits sample access) and the point's weight is the (unknown) mean of the underlying distribution. In this MAB setup, we develop sample-efficient algorithms that find, with high probability, near-optimal arms within the given intervals, i.e., we obtain PAC (probably approximately correct) guarantees. We also provide an algorithm for a generalization wherein the weight of each point is a multi-dimensional vector. The sample complexities of our algorithms depend, in particular, on the size of the optimal hitting set of the given intervals. Finally, we establish lower bounds proving that the obtained sample complexities are essentially tight. Our results highlight the significance of geometric constructs -- specifically, hitting sets -- in our MAB setting.

7.5AISep 3, 2019

Quantifying Infra-Marginality and Its Trade-off with Group Fairness

Arpita Biswas, Siddharth Barman, Amit Deshpande et al.

In critical decision-making scenarios, optimizing accuracy can lead to a biased classifier, hence past work recommends enforcing group-based fairness metrics in addition to maximizing accuracy. However, doing so exposes the classifier to another kind of bias called infra-marginality. This refers to individual-level bias where some individuals/subgroups can be worse off than under simply optimizing for accuracy. For instance, a classifier implementing race-based parity may significantly disadvantage women of the advantaged race. To quantify this bias, we propose a general notion of $η$-infra-marginality that can be used to evaluate the extent of this bias. We prove theoretically that, unlike other fairness metrics, infra-marginality does not have a trade-off with accuracy: high accuracy directly leads to low infra-marginality. This observation is confirmed through empirical analysis on multiple simulated and real-world datasets. Further, we find that maximizing group fairness often increases infra-marginality, suggesting the consideration of both group-level fairness and individual-level infra-marginality. However, measuring infra-marginality requires knowledge of the true distribution of individual-level outcomes correctly and explicitly. We propose a practical method to measure infra-marginality, and a simple algorithm to maximize group-wise accuracy and avoid infra-marginality.

13.4GTApr 25, 2018

Fair Division Under Cardinality Constraints

Siddharth Barman, Arpita Biswas

We consider the problem of fairly allocating indivisible goods, among agents, under cardinality constraints and additive valuations. In this setting, we are given a partition of the entire set of goods---i.e., the goods are categorized---and a limit is specified on the number of goods that can be allocated from each category to any agent. The objective here is to find a fair allocation in which the subset of goods assigned to any agent satisfies the given cardinality constraints. This problem naturally captures a number of resource-allocation applications, and is a generalization of the well-studied (unconstrained) fair division problem. The two central notions of fairness, in the context of fair division of indivisible goods, are envy freeness up to one good (EF1) and the (approximate) maximin share guarantee (MMS). We show that the existence and algorithmic guarantees established for these solution concepts in the unconstrained setting can essentially be achieved under cardinality constraints. Specifically, we develop efficient algorithms which compute EF1 and approximately MMS allocations in the constrained setting. Furthermore, focusing on the case wherein all the agents have the same additive valuation, we establish that EF1 allocations exist and can be computed efficiently even under laminar matroid constraints.

10.3GTNov 21, 2017

Groupwise Maximin Fair Allocation of Indivisible Goods

Siddharth Barman, Arpita Biswas, Sanath Kumar Krishnamurthy et al.

We study the problem of allocating indivisible goods among n agents in a fair manner. For this problem, maximin share (MMS) is a well-studied solution concept which provides a fairness threshold. Specifically, maximin share is defined as the minimum utility that an agent can guarantee for herself when asked to partition the set of goods into n bundles such that the remaining (n-1) agents pick their bundles adversarially. An allocation is deemed to be fair if every agent gets a bundle whose valuation is at least her maximin share. Even though maximin shares provide a natural benchmark for fairness, it has its own drawbacks and, in particular, it is not sufficient to rule out unsatisfactory allocations. Motivated by these considerations, in this work we define a stronger notion of fairness, called groupwise maximin share guarantee (GMMS). In GMMS, we require that the maximin share guarantee is achieved not just with respect to the grand bundle, but also among all the subgroups of agents. Hence, this solution concept strengthens MMS and provides an ex-post fairness guarantee. We show that in specific settings, GMMS allocations always exist. We also establish the existence of approximate GMMS allocations under additive valuations, and develop a polynomial-time algorithm to find such allocations. Moreover, we establish a scale of fairness wherein we show that GMMS implies approximate envy freeness. Finally, we empirically demonstrate the existence of GMMS allocations in a large set of randomly generated instances. For the same set of instances, we additionally show that our algorithm achieves an approximation factor better than the established, worst-case bound.

2.0LGJun 13, 2017

Online Learning for Structured Loss Spaces

Siddharth Barman, Aditya Gopalan, Aadirupa Saha

We consider prediction with expert advice when the loss vectors are assumed to lie in a set described by the sum of atomic norm balls. We derive a regret bound for a general version of the online mirror descent (OMD) algorithm that uses a combination of regularizers, each adapted to the constituent atomic norms. The general result recovers standard OMD regret bounds, and yields regret bounds for new structured settings where the loss vectors are (i) noisy versions of points from a low-rank subspace, (ii) sparse vectors corrupted with noise, and (iii) sparse perturbations of low-rank vectors. For the problem of online learning with structured losses, we also show lower bounds on regret in terms of rank and sparsity of the source set of the loss vectors, which implies lower bounds for the above additive loss settings as well.

14.6LGApr 25, 2015

Online Convex Optimization Using Predictions

Niangjun Chen, Anish Agarwal, Adam Wierman et al.

Making use of predictions is a crucial, but under-explored, area of online algorithms. This paper studies a class of online optimization problems where we have external noisy predictions available. We propose a stochastic prediction error model that generalizes prior models in the learning and stochastic control communities, incorporates correlation among prediction errors, and captures the fact that predictions improve as time passes. We prove that achieving sublinear regret and constant competitive ratio for online algorithms requires the use of an unbounded prediction window in adversarial settings, but that under more realistic stochastic prediction error models it is possible to use Averaging Fixed Horizon Control (AFHC) to simultaneously achieve sublinear regret and constant competitive ratio in expectation using only a constant-sized prediction window. Furthermore, we show that the performance of AFHC is tightly concentrated around its mean.