Roohani Sharma

DS
3papers
28citations
Novelty48%
AI Score38

3 Papers

DSApr 6, 2023
Parameterized Approximation Schemes for Clustering with General Norm Objectives

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka et al.

This paper considers the well-studied algorithmic regime of designing a $(1+ε)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,ε)poly(n)$ (sometimes called an efficient parameterized approximation scheme or EPAS for short). Notable results of this kind include EPASes in the high-dimensional Euclidean setting for $k$-center [Badŏiu, Har-Peled, Indyk; STOC'02] as well as $k$-median, and $k$-means [Kumar, Sabharwal, Sen; J. ACM 2010]. However, existing EPASes handle only basic objectives (such as $k$-center, $k$-median, and $k$-means) and are tailored to the specific objective and metric space. Our main contribution is a clean and simple EPAS that settles more than ten clustering problems (across multiple well-studied objectives as well as metric spaces) and unifies well-known EPASes. Our algorithm gives EPASes for a large variety of clustering objectives (for example, $k$-means, $k$-center, $k$-median, priority $k$-center, $\ell$-centrum, ordered $k$-median, socially fair $k$-median aka robust $k$-median, or more generally monotone norm $k$-clustering) and metric spaces (for example, continuous high-dimensional Euclidean spaces, metrics of bounded doubling dimension, bounded treewidth metrics, and planar metrics). Key to our approach is a new concept that we call bounded $ε$-scatter dimension--an intrinsic complexity measure of a metric space that is a relaxation of the standard notion of bounded doubling dimension. Our main technical result shows that two conditions are essentially sufficient for our algorithm to yield an EPAS on the input metric $M$ for any clustering objective: (i) The objective is described by a monotone (not necessarily symmetric!) norm, and (ii) the $ε$-scatter dimension of $M$ is upper bounded by a function of $ε$.

DSMar 23
A Dividing Line for Structural Kernelization of Component Order Connectivity via Distance to Bounded Pathwidth

Jakob Greilhuber, Roohani Sharma

In this work we study a classic generalization of the Vertex Cover (VC) problem, called the Component Order Connectivity (COC) problem. In COC, given an undirected graph $G$, integers $d \geq 1$ and $k$, the goal is to determine if there is a set of at most $k$ vertices whose deletion results in a graph where each connected component has at most $d$ vertices. When $d=1$, this is exactly VC. This work is inspired by polynomial kernelization results with respect to structural parameters for VC. On one hand, Jansen & Bodlaender [TOCS 2013] show that VC admits a polynomial kernel when the parameter is the distance to treewidth-$1$ graphs, on the other hand Cygan, Lokshtanov, Pilipczuk, Pilipczuk & Saurabh [TOCS 2014] showed that VC does not admit a polynomial kernel when the parameter is distance to treewidth-$2$ graphs. Greilhuber & Sharma [IPEC 2024] showed that, for any $d \geq 2$, $d$-COC cannot admit a polynomial kernel when the parameter is distance to a forest of pathwidth $2$. Here, $d$-COC is the same as COC only that $d$ is a fixed constant not part of the input. We complement this result and show that like for the VC problem where distance to treewidth-$1$ graphs versus distance to treewidth-$2$ graphs is the dividing line between structural parameterizations that allow and respectively disallow polynomial kernelization, for COC this dividing line happens between distance to pathwidth-$1$ graphs and distance to pathwidth-$2$ graphs. The main technical result of this work is that COC admits a polynomial kernel parameterized by distance to pathwidth-$1$ graphs plus $d$.

DSMay 12, 2023
Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka et al.

We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,δ)$ and a positive integer $k$. Further, each point belongs to one (or more) of the $m$ many different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT $2021$], which is tight under a plausible complexity assumption even in the line metrics. For FPT time, there is a $(3^z+ε)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on \emph{geometric} spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $η_0 >0.0006$, we devise a $3^z(1-η_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $Θ(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+ε)$-approximation scheme (EPAS) for the metric of sub-logarithmic doubling dimension.