Sanjeev Khanna

DS
7papers
143citations
Novelty64%
AI Score54

7 Papers

LGMay 2, 2022
A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits

Arpit Agarwal, Sanjeev Khanna, Prathamesh Patil

The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice however, the number of arms can be very large resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorithm uses limited memory to process these arms. Here, the goal is not only to minimize regret, but also to do so in minimal memory. Previous algorithms for this problem operate in one of the two settings: they either use $Ω(\log \log T)$ passes over the stream (Rathod, 2021; Chaudhuri and Kalyanakrishnan, 2020; Liau et al., 2018), or just a single pass (Maiti et al., 2021). In this paper we study the trade-off between memory and regret when $B$ passes over the stream are allowed, for any $B \geq 1$, and establish tight regret upper and lower bounds for any $B$-pass algorithm. Our results uncover a surprising *sharp transition phenomenon*: $O(1)$ memory is sufficient to achieve $\widetildeΘ\Big(T^{\frac{1}{2} + \frac{1}{2^{B+2}-2}}\Big)$ regret in $B$ passes, and increasing the memory to any quantity that is $o(K)$ has almost no impact on further reducing this regret, unless we use $Ω(K)$ memory. Our main technical contribution is our lower bound which requires the use of information-theoretic techniques as well as ideas from round elimination to show that the *residual problem* remains challenging over subsequent passes.

DSJun 15, 2022
Sublinear Algorithms for Hierarchical Clustering

Arpit Agarwal, Sanjeev Khanna, Huan Li et al.

Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require linear time/space complexity. In many applications the underlying graph can be massive in size making it computationally challenging to process the graph even using a linear time/space algorithm. As a result, there is a strong interest in designing algorithms that can perform global computation using only sublinear resources. The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) (dynamic) streaming model where edges are presented as a stream, (2) query model where the graph is queried using neighbor and degree queries, (3) MPC model where the graph edges are partitioned over several machines connected via a communication channel. We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model. We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing better algorithms in each of these models.

DSApr 6
Improved Approximation Algorithms for Capacitated Network Design and Flexible Graph Connectivity

Ishan Bansal, Joseph Cheriyan, Sanjeev Khanna et al.

We present improved approximation algorithms for some problems in the related areas of Capacitated Network Design and Flexible Graph Connectivity. In the Cap-$k$-ECSS problem, we are given a graph $G=(V,E)$ whose edges have non-negative costs and positive integer capacities, and the goal is to find a minimum-cost edge-set $F$ such that every non-trivial cut of the graph $G'=(V,F)$ has capacity at least $k$. We present an $O(\log k)$-approximation algorithm for the Cap-$k$-ECSS problem, asymptotically improving upon the previous best approximation ratio of $\min(O(\log n),\; O(k))$ whenever $\log(k)=o(\log n)$, where $n$ denotes $|V|$. (See section 1, for a detailed discussion.) In the $(p,q)$-Flexible Graph Connectivity problem, denoted $(p,q)$-FGC, the input is a graph $G(V, E)$ where $E$ is partitioned into safe and unsafe edges, and the goal is to find a minimum cost set of edges $F$ such that the subgraph $G'(V, F)$ remains $p$-edge connected upon removal of any $q$ unsafe edges from $F$. We design a $7$-approximation algorithm for the $(1,q)$-FGC problem, improving on the previous best approximation ratio of $(q+1)$. Both of our results are obtained by using natural LP relaxations strengthened with the knapsack-cover inequalities, and then, during the rounding process, utilizing a recent $O(1)$-approximation algorithm for the Cover$\;$Small$\;$Cuts problem. In the latter problem, the goal is to find a minimum-cost set of links such that each non-trivial cut of capacity less than a specified value is covered by a link. We also show that the problem of covering small cuts inherently arises in another variant of $(p,q)$-FGC. Specifically, we give Cook reductions that preserve approximation ratios within $O(1)$ factors between the $(2,q)$-FGC problem and the 2-Cover$\;$Small$\;$Cuts problem; in the latter problem, each small cut needs to be covered by two links.

DSMar 25
Fault-Tolerant Distance Oracles Below the $n \cdot f$ Barrier

Sanjeev Khanna, Christian Konrad, Aaron Putterman

Fault-tolerant spanners are fundamental objects that preserve distances in graphs even under edge failures. A long line of work culminating in Bodwin, Dinitz, Robelle (SODA 2022) gives $(2k-1)$-stretch, $f$-fault-tolerant spanners with $O(k^2 f^{\frac{1}{2}-\frac{1}{2k}} n^{1+\frac{1}{k}} + k f n)$ edges for any odd $k$. For any $k = \tilde{O}(1)$, this bound is essentially optimal for deterministic spanners in part due to a known folklore lower bound that \emph{any} $f$-fault-tolerant spanner requires $Ω(nf)$ edges in the worst case. For $f \geq n$, this $Ω(nf)$ barrier means that any $f$-fault tolerant spanners are trivial in size. Crucially however, this folklore lower bound exploits that the spanner \emph{is itself a subgraph}. It does not rule out distance-reporting data structures that may not be subgraphs. This leads to our central question: can one beat the $n \cdot f$ barrier with fault-tolerant distance oracles? We give a strong affirmative answer to this question. As our first contribution, we construct $f$-fault-tolerant distance oracles with stretch $O(\log(n)\log\log(n))$ that require only $\widetilde{O}(n\sqrt{f})$ bits of space; substantially below the spanner barrier of $n \cdot f$. Beyond this, in the regime $n \leq f \leq n^{3/2}$ we show that by using our new \emph{high-degree, low-diameter} decomposition in combination with tools from sparse recovery, we can even obtain stretch $7$ distance oracles in space $\widetilde{O}(n^{3/2}f^{1/3})$ bits. We also show that our techniques are sufficiently general to yield randomized sketches for fault-tolerant ``oblivious'' spanners and fault-tolerant deterministic distance oracles in bounded-deletion streams, with space below the $nf$ barrier in both settings.

DSMay 5
An $\widetilde{O} (n^{3/7})$ Round Parallel Algorithm for Matroid Bases

Sanjeev Khanna, Aaron Putterman, Junkai Song

We study the parallel (adaptive) complexity of the classic problem of finding a basis in an $n$-element matroid, given access via an \emph{independence oracle}. In this model, the algorithm may submit polynomially many independence queries in each round, and the central question is: how many rounds are necessary and sufficient to find a basis? Karp, Upfal, and Wigderson (FOCS~1985, JCSS~1988; hereafter KUW) initiated this study, showing that $O(\sqrt{n})$ adaptive rounds suffice for any matroid, and that $\widetildeΩ(n^{1/3})$ rounds are necessary even for partition matroids. This left a substantial gap that persisted for nearly four decades, until Khanna, Putterman, and Song (FOCS~2025; hereafter KPS) achieved $\widetilde O(n^{7/15})$ rounds, the first improvement since~KUW. In this work, we make another conceptual advance beyond KPS, giving a new algorithm that finds a matroid basis in $\widetilde O(n^{3/7})$ rounds. We develop a structural and algorithmic framework that brings a new lens to the analysis of random circuits, moving from reasoning about individual elements to understanding how dependencies span multiple elements simultaneously.

DSMay 1
A Faster Deterministic Algorithm for Fully Dynamic Maximal Matching

Julia Chuzhoy, Sanjeev Khanna, Junkai Song

In the fully dynamic maximal matching problem, the goal is to maintain a maximal matching in a graph undergoing an online sequence of edge insertions and deletions. The problem has been studied extensively in the oblivious-adversary setting, where randomized algorithms with polylogarithmic worst-case and constant amortized update time have been known for some time. A major challenge in this area has been designing an algorithm with non-trivial update time against an adaptive adversary. In a recent breakthrough, Bernstein, Bhattacharya, Kiss, and Saranurak (STOC 2025; hereafter, BBKS25) obtained the first algorithms with sublinear update time for this setting: namely, a randomized algorithm with $\tilde{O}(n^{3/4})$ amortized update time, and a deterministic algorithm with $\tilde{O}(n^{8/9})$ amortized update time. Our main result is a deterministic algorithm for fully dynamic maximal matching with amortized update time $n^{1/2+o(1)}$. A powerful tool in dynamic matching is the use of matching sparsifiers: sparse subgraphs that preserve enough information to recover matchings with desired properties. Sparsifiers, such as the EDCS data structure, have been successfully used for approximate maximum matching. For maximal matching, however, this paradigm is not as natural, since maximality must hold with respect to the entire graph. Nevertheless, BBKS25 showed that EDCS can be repurposed as a verification-and-repair mechanism for fully dynamic maximal matching against adaptive adversaries. We introduce a new deterministic framework, referred to as the subgraph system, which, in contrast to EDCS, is purpose-built for verification and maintenance of maximality. It is also designed to allow efficient recursive refinements leading to stronger and stronger parameters, that yield our deterministic algorithm with $n^{1/2+o(1)}$ amortized update time.

DSFeb 22, 2012
Distributed Private Heavy Hitters

Justin Hsu, Sanjeev Khanna, Aaron Roth

In this paper, we give efficient algorithms and lower bounds for solving the heavy hitters problem while preserving differential privacy in the fully distributed local model. In this model, there are n parties, each of which possesses a single element from a universe of size N. The heavy hitters problem is to find the identity of the most common element shared amongst the n parties. In the local model, there is no trusted database administrator, and so the algorithm must interact with each of the $n$ parties separately, using a differentially private protocol. We give tight information-theoretic upper and lower bounds on the accuracy to which this problem can be solved in the local model (giving a separation between the local model and the more common centralized model of privacy), as well as computationally efficient algorithms even in the case where the data universe N may be exponentially large.