Anupam Gupta

h-index24

17papers

420citations

Novelty52%

AI Score57

Ranked #18,011 of 205,806 authors (top 9%)#4,016 in LG (top 9%)

17 Papers

DSNov 7, 2011

Near Linear-Work Parallel SDD Solvers, Low-Diameter Decomposition, and Low-Stretch Subgraphs

Guy E. Blelloch, Anupam Gupta, Ioannis Koutis et al.

We present the design and analysis of a near linear-work parallel algorithm for solving symmetric diagonally dominant (SDD) linear systems. On input of a SDD $n$-by-$n$ matrix $A$ with $m$ non-zero entries and a vector $b$, our algorithm computes a vector $\tilde{x}$ such that $\norm[A]{\tilde{x} - A^+b} \leq \vareps \cdot \norm[A]{A^+b}$ in $O(m\log^{O(1)}{n}\log{\frac1ε})$ work and $O(m^{1/3+θ}\log \frac1ε)$ depth for any fixed $θ> 0$. The algorithm relies on a parallel algorithm for generating low-stretch spanning trees or spanning subgraphs. To this end, we first develop a parallel decomposition algorithm that in polylogarithmic depth and $\otilde(|E|)$ work, partitions a graph into components with polylogarithmic diameter such that only a small fraction of the original edges are between the components. This can be used to generate low-stretch spanning trees with average stretch $O(n^α)$ in $O(n^{1+α})$ work and $O(n^α)$ depth. Alternatively, it can be used to generate spanning subgraphs with polylogarithmic average stretch in $\otilde(|E|)$ work and polylogarithmic depth. We apply this subgraph construction to derive a parallel linear system solver. By using this solver in known applications, our results imply improved parallel randomized algorithms for several problems, including single-source shortest paths, maximum flow, minimum-cost flow, and approximate maximum flow.

LGOct 1, 2023

Improving Length-Generalization in Transformers via Task Hinting

Pranjal Awasthi, Anupam Gupta

It has been observed in recent years that transformers have problems with length generalization for certain types of reasoning and arithmetic tasks. In particular, the performance of a transformer model trained on tasks (say addition) up to a certain length (e.g., 5 digit numbers) drops sharply when applied to longer instances of the same problem. This work proposes an approach based on task hinting towards addressing length generalization. Our key idea is that while training the model on task-specific data, it is helpful to simultaneously train the model to solve a simpler but related auxiliary task as well. We study the classical sorting problem as a canonical example to evaluate our approach. We design a multitask training framework and show that task hinting significantly improve length generalization. For sorting we show that it is possible to train models on data consisting of sequences having length at most $20$, and improve the test accuracy on sequences of length $100$ from less than 1% (for standard training) to more than 92% (via task hinting). Our study uncovers several interesting aspects of length generalization. We observe that while several auxiliary tasks may seem natural a priori, their effectiveness in improving length generalization differs dramatically. We further use probing and visualization-based techniques to understand the internal mechanisms via which the model performs the task, and propose a theoretical construction consistent with the observed learning behaviors of the model. Based on our construction, we show that introducing a small number of length dependent parameters into the training procedure can further boost the performance on unseen lengths. Finally, we also show the efficacy of our task hinting based approach beyond sorting, giving hope that these techniques will be applicable in broader contexts.

GTFeb 13

Contextual Online Bilateral Trade

Romain Cosson, Federico Fusco, Anupam Gupta et al.

We study repeated bilateral trade when the valuations of the sellers and the buyers are contextual. More precisely, the agents' valuations are given by the inner product of a context vector with two unknown $d$-dimensional vectors -- one for the buyers and one for the sellers. At each time step $t$, the learner receives a context and posts two prices, one for the seller and one for the buyer, and the trade happens if both agents accept their price. We study two objectives for this problem, gain from trade and profit, proving no-regret with respect to a surprisingly strong benchmark: the best omniscient dynamic strategy. In the natural scenario where the learner observes \emph{separately} whether the agents accept their price -- the so-called \emph{two-bit} feedback -- we design algorithms that achieve $O(d\log d)$ regret for gain from trade, and $O(d \log\log T + d\log d)$ regret for profit maximization. Both results are tight, up to the $\log(d)$ factor, and implement per-step budget balance, meaning that the learner never incurs negative profit. In the less informative \emph{one-bit} feedback model, the learner only observes whether a trade happens or not. For this scenario, we show that the tight two-bit regret regimes are still attainable, at the cost of allowing the learner to possibly incur a small negative profit of order $O(d\log d)$, which is notably independent of the time horizon. As a final set of results, we investigate the combination of one-bit feedback and per-step budget balance. There, we design an algorithm for gain from trade that suffers regret independent of the time horizon, but \emph{exponential} in the dimension $d$. For profit maximization, we maintain this exponential dependence on the dimension, which gets multiplied by a $\log T$ factor.

67.4DSMar 18

A Simpler Analysis for $\varepsilon$-Clairvoyant Flow Time Scheduling

Anupam Gupta, Haim Kaplan, Alexander Lindermayr et al.

We simplify the proof of the optimality of the Shortest Lower-Bound First (SLF) algorithm, introduced by Gupta, Kaplan, Lindermayr, SchlÃ¶ter, and Yingchareonthawornchai [FOCS'25], for minimizing the total flow time in the $\varepsilon$-clairvoyant setting.

78.7DSMar 15

Improved Online Hitting Set Algorithms for Structured and Geometric Set Systems

Sujoy Bhore, Anupam Gupta, Amit Kumar

In the online hitting set problem, sets arrive over time, and the algorithm has to maintain a subset of elements that hit all the sets seen so far. Alon, Awerbuch, Azar, Buchbinder, and Naor (SICOMP 2009) gave an algorithm with competitive ratio $O(\log n \log m)$ for the (general) online hitting set and set cover problems for $m$ sets and $n$ elements; this is known to be tight for efficient online algorithms. Given this barrier for general set systems, we ask: can we break this double-logarithmic phenomenon for online hitting set/set cover on structured and geometric set systems? We provide an $O(\log n \log\log n)$-competitive algorithm for the weighted online hitting set problem on set systems with linear shallow-cell complexity, replacing the double-logarithmic factor in the general result by effectively a single logarithmic term. As a consequence of our results we obtain the first bounds for weighted online hitting set for natural geometric set families, thereby answering open questions regarding the gap between general and geometric weighted online hitting set problems.

69.0DSMay 11

FPT Approximation Schemes for Min-Sum Radii and Min-Sum Diameters Clustering

Fabrizio Grandoni, Anupam Gupta, Jatin Yadav

In the classical Min-Sum Radii problem (MSR) we are given a set $X$ of $n$ points in a metric space and a positive integer $k\in [n]$. Our goal is to partition $X$ into $k$ subsets (the clusters) so as to minimize the sum of the radii of these clusters. The Min-Sum Diameters problem (MSD) is defined analogously, where instead of the radii of the clusters we consider their diameters. For both problems we present FPT approximation schemes for the natural parameter $k$. Specifically, given $ε>0$, we show how to compute $(1+ε)$-approximations for both MSD and MSR in time $(1/ε)^kn^{O(1)}$ and $(1/ε)^{O(k/ε\log 1/ε)}n^{poly(1/ε)}$ respectively. The previous best FPT approximation algorithms for these problems have approximation factors $4+ε$ and $2+ε$, respectively, and finding an FPT approximation scheme for both these problems had been outstanding open problems.

GTMar 18, 2024

MAC Advice for Facility Location Mechanism Design

Zohar Barak, Anupam Gupta, Inbal Talgam-Cohen

Algorithms with predictions have attracted much attention in the last years across various domains, including variants of facility location, as a way to surpass traditional worst-case analyses. We study the $k$-facility location mechanism design problem, where the $n$ agents are strategic and might misreport their location. Unlike previous models, where predictions are for the $k$ optimal facility locations, we receive $n$ predictions for the locations of each of the agents. However, these predictions are only "mostly" and "approximately" correct (or MAC for short) -- i.e., some $δ$-fraction of the predicted locations are allowed to be arbitrarily incorrect, and the remainder of the predictions are allowed to be correct up to an $\varepsilon$-error. We make no assumption on the independence of the errors. Can such predictions allow us to beat the current best bounds for strategyproof facility location? We show that the $1$-median (geometric median) of a set of points is naturally robust under corruptions, which leads to an algorithm for single-facility location with MAC predictions. We extend the robustness result to a "balanced" variant of the $k$ facilities case. Without balancedness, we show that robustness completely breaks down, even for the setting of $k=2$ facilities on a line. For this "unbalanced" setting, we devise a truthful random mechanism that outperforms the best known result of Lu et al. [2010], which does not use predictions. En route, we introduce the problem of "second" facility location (when the first facility's location is already fixed). Our findings on the robustness of the $1$-median and more generally $k$-medians may be of independent interest, as quantitative versions of classic breakdown-point results in robust statistics.

LGFeb 2

Learning Markov Decision Processes under Fully Bandit Feedback

Zhengjia Zhuo, Anupam Gupta, Viswanath Nagarajan

A standard assumption in Reinforcement Learning is that the agent observes every visited state-action pair in the associated Markov Decision Process (MDP), along with the per-step rewards. Strong theoretical results are known in this setting, achieving nearly-tight $Θ(\sqrt{T})$-regret bounds. However, such detailed feedback can be unrealistic, and recent research has investigated more restricted settings such as trajectory feedback, where the agent observes all the visited state-action pairs, but only a single \emph{aggregate} reward. In this paper, we consider a far more restrictive ``fully bandit'' feedback model for episodic MDPs, where the agent does not even observe the visited state-action pairs -- it only learns the aggregate reward. We provide the first efficient bandit learning algorithm for episodic MDPs with $\widetilde{O}(\sqrt{T})$ regret. Our regret has an exponential dependence on the horizon length $\H$, which we show is necessary. We also obtain improved nearly-tight regret bounds for ``ordered'' MDPs; these can be used to model classical stochastic optimization problems such as $k$-item prophet inequality and sequential posted pricing. Finally, we evaluate the empirical performance of our algorithm for the setting of $k$-item prophet inequalities; despite the highly restricted feedback, our algorithm's performance is comparable to that of a state-of-art learning algorithm (UCB-VI) with detailed state-action feedback.

GTNov 14, 2024

Randomized Truthful Auctions with Learning Agents

Gagan Aggarwal, Anupam Gupta, Andres Perlroth et al.

We study a setting where agents use no-regret learning algorithms to participate in repeated auctions. \citet{kolumbus2022auctions} showed, rather surprisingly, that when bidders participate in second-price auctions using no-regret bidding algorithms, no matter how large the number of interactions $T$ is, the runner-up bidder may not converge to bidding truthfully. Our first result shows that this holds for \emph{general deterministic} truthful auctions. We also show that the ratio of the learning rates of the bidders can \emph{qualitatively} affect the convergence of the bidders. Next, we consider the problem of revenue maximization in this environment. In the setting with fully rational bidders, \citet{myerson1981optimal} showed that revenue can be maximized by using a second-price auction with reserves.We show that, in stark contrast, in our setting with learning bidders, \emph{randomized} auctions can have strictly better revenue guarantees than second-price auctions with reserves, when $T$ is large enough. Finally, we study revenue maximization in the non-asymptotic regime. We define a notion of {\em auctioneer regret} comparing the revenue generated to the revenue of a second price auction with truthful bids. When the auctioneer has to use the same auction throughout the interaction, we show an (almost) tight regret bound of $\smash{\widetilde Θ(T^{3/4})}.$ If the auctioneer can change auctions during the interaction, but in a way that is oblivious to the bids, we show an (almost) tight bound of $\smash{\widetilde Θ(\sqrt{T})}.$

CVFeb 8, 2024

Early Fusion of Features for Semantic Segmentation

Anupam Gupta, Ashok Krishnamurthy, Lisa Singh

This paper introduces a novel segmentation framework that integrates a classifier network with a reverse HRNet architecture for efficient image segmentation. Our approach utilizes a ResNet-50 backbone, pretrained in a semi-supervised manner, to generate feature maps at various scales. These maps are then processed by a reverse HRNet, which is adapted to handle varying channel dimensions through 1x1 convolutions, to produce the final segmentation output. We strategically avoid fine-tuning the backbone network to minimize memory consumption during training. Our methodology is rigorously tested across several benchmark datasets including Mapillary Vistas, Cityscapes, CamVid, COCO, and PASCAL-VOC2012, employing metrics such as pixel accuracy and mean Intersection over Union (mIoU) to evaluate segmentation performance. The results demonstrate the effectiveness of our proposed model in achieving high segmentation accuracy, indicating its potential for various applications in image analysis. By leveraging the strengths of both the ResNet-50 and reverse HRNet within a unified framework, we present a robust solution to the challenges of image segmentation.

CVMar 17, 2019

Spatiotemporal Filtering for Event-Based Action Recognition

Rohan Ghosh, Anupam Gupta, Andrei Nakagawa et al.

In this paper, we address the challenging problem of action recognition, using event-based cameras. To recognise most gestural actions, often higher temporal precision is required for sampling visual information. Actions are defined by motion, and therefore, when using event-based cameras it is often unnecessary to re-sample the entire scene. Neuromorphic, event-based cameras have presented an alternative to visual information acquisition by asynchronously time-encoding pixel intensity changes, through temporally precise spikes (10 micro-second resolution), making them well equipped for action recognition. However, other challenges exist, which are intrinsic to event-based imagers, such as higher signal-to-noise ratio, and a spatiotemporally sparse information. One option is to convert event-data into frames, but this could result in significant temporal precision loss. In this work we introduce spatiotemporal filtering in the spike-event domain, as an alternative way of channeling spatiotemporal information through to a convolutional neural network. The filters are local spatiotemporal weight matrices, learned from the spike-event data, in an unsupervised manner. We find that appropriate spatiotemporal filtering significantly improves CNN performance beyond state-of-the-art on the event-based DVS Gesture dataset. On our newly recorded action recognition dataset, our method shows significant improvement when compared with other, standard ways of generating the spatiotemporal filters.

CVMar 16, 2019

Spatiotemporal Feature Learning for Event-Based Vision

Rohan Ghosh, Anupam Gupta, Siyi Tang et al.

Unlike conventional frame-based sensors, event-based visual sensors output information through spikes at a high temporal resolution. By only encoding changes in pixel intensity, they showcase a low-power consuming, low-latency approach to visual information sensing. To use this information for higher sensory tasks like object recognition and tracking, an essential simplification step is the extraction and learning of features. An ideal feature descriptor must be robust to changes involving (i) local transformations and (ii) re-appearances of a local event pattern. To that end, we propose a novel spatiotemporal feature representation learning algorithm based on slow feature analysis (SFA). Using SFA, smoothly changing linear projections are learnt which are robust to local visual transformations. In order to determine if the features can learn to be invariant to various visual transformations, feature point tracking tasks are used for evaluation. Extensive experiments across two datasets demonstrate the adaptability of the spatiotemporal feature learner to translation, scaling and rotational transformations of the feature points. More importantly, we find that the obtained feature representations are able to exploit the high temporal resolution of such event-based cameras in generating better feature tracks.

LGFeb 22, 2019

Better Algorithms for Stochastic Bandits with Adversarial Corruptions

Anupam Gupta, Tomer Koren, Kunal Talwar

We study the stochastic multi-armed bandits problem in the presence of adversarial corruption. We present a new algorithm for this problem whose regret is nearly optimal, substantially improving upon previous work. Our algorithm is agnostic to the level of adversarial contamination and can tolerate a significant amount of corruption with virtually no degradation in performance.

LGDec 13, 2017

Potential-Function Proofs for First-Order Methods

Nikhil Bansal, Anupam Gupta

This note discusses proofs for convergence of first-order methods based on simple potential-function arguments. We cover methods like gradient descent (for both smooth and non-smooth settings), mirror descent, and some accelerated variants.

LGJun 4, 2017

Nearly Optimal Sampling Algorithms for Combinatorial Pure Exploration

Lijie Chen, Anupam Gupta, Jian Li et al.

We study the combinatorial pure exploration problem Best-Set in stochastic multi-armed bandits. In a Best-Set instance, we are given $n$ arms with unknown reward distributions, as well as a family $\mathcal{F}$ of feasible subsets over the arms. Our goal is to identify the feasible subset in $\mathcal{F}$ with the maximum total mean using as few samples as possible. The problem generalizes the classical best arm identification problem and the top-$k$ arm identification problem, both of which have attracted significant attention in recent years. We provide a novel instance-wise lower bound for the sample complexity of the problem, as well as a nontrivial sampling algorithm, matching the lower bound up to a factor of $\ln|\mathcal{F}|$. For an important class of combinatorial families, we also provide polynomial time implementation of the sampling algorithm, using the equivalence of separation and optimization for convex program, and approximate Pareto curves in multi-objective optimization. We also show that the $\ln|\mathcal{F}|$ factor is inevitable in general through a nontrivial lower bound construction. Our results significantly improve several previous results for several important combinatorial constraints, and provide a tighter understanding of the general Best-Set problem. We further introduce an even more general problem, formulated in geometric terms. We are given $n$ Gaussian arms with unknown means and unit variance. Consider the $n$-dimensional Euclidean space $\mathbb{R}^n$, and a collection $\mathcal{O}$ of disjoint subsets. Our goal is to determine the subset in $\mathcal{O}$ that contains the $n$-dimensional vector of the means. The problem generalizes most pure exploration bandit problems studied in the literature. We provide the first nearly optimal sample complexity upper and lower bounds for the problem.

LGMay 23, 2016

Pure Exploration of Multi-armed Bandit Under Matroid Constraints

Lijie Chen, Anupam Gupta, Jian Li

We study the pure exploration problem subject to a matroid constraint (Best-Basis) in a stochastic multi-armed bandit game. In a Best-Basis instance, we are given $n$ stochastic arms with unknown reward distributions, as well as a matroid $\mathcal{M}$ over the arms. Let the weight of an arm be the mean of its reward distribution. Our goal is to identify a basis of $\mathcal{M}$ with the maximum total weight, using as few samples as possible. The problem is a significant generalization of the best arm identification problem and the top-$k$ arm identification problem, which have attracted significant attentions in recent years. We study both the exact and PAC versions of Best-Basis, and provide algorithms with nearly-optimal sample complexities for these versions. Our results generalize and/or improve on several previous results for the top-$k$ arm identification problem and the combinatorial pure exploration problem when the combinatorial constraint is a matroid.

CROct 30, 2013

Some Efficient Solutions to Yao's Millionaire Problem

Ashish Kumar, Anupam Gupta

We present three simple and efficient protocol constructions to solve Yao's Millionaire Problem when the parties involved are non-colluding and semi-honest. The first construction uses a partially homomorphic Encryption Scheme and is a 4-round scheme using 2 encryptions, 2 homomorphic circuit evaluations (subtraction and XOR) and a single decryption. The second construction uses an untrusted third party and achieves a communication overhead linear in input bit-size with the help of an order preserving function.Moreover, the second construction does not require an apriori input bound and can work on inputs of different bit-sizes. The third construction does not use a third party and, even though, it has a quadratic communication overhead, it is a fairly simple construction.