Dániel Marx

h-index 46

4 papers

39 citations

Novelty 55%

AI Score 40

Ranked #75,029 of 194,257 authors (top 39%); #258 in DS (top 54%)

4 Papers

7.3 · DS · Apr 6, 2023

Parameterized Approximation Schemes for Clustering with General Norm Objectives

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka et al.

This paper considers the well-studied algorithmic regime of designing a $(1+ε)$-approximation algorithm for a $k$-clustering problem that runs in time $f(k,ε)\cdot\mathrm{poly}(n)$ (sometimes called an efficient parameterized approximation scheme or EPAS for short). Notable results of this kind include EPASes in the high-dimensional Euclidean setting for $k$-center [Bădoiu, Har-Peled, Indyk; STOC'02] as well as $k$-median and $k$-means [Kumar, Sabharwal, Sen; J. ACM 2010]. However, existing EPASes handle only basic objectives (such as $k$-center, $k$-median, and $k$-means) and are tailored to the specific objective and metric space. Our main contribution is a clean and simple EPAS that settles more than ten clustering problems (across multiple well-studied objectives as well as metric spaces) and unifies well-known EPASes. Our algorithm gives EPASes for a large variety of clustering objectives (for example, $k$-means, $k$-center, $k$-median, priority $k$-center, $\ell$-centrum, ordered $k$-median, socially fair $k$-median aka robust $k$-median, or more generally monotone norm $k$-clustering) and metric spaces (for example, continuous high-dimensional Euclidean spaces, metrics of bounded doubling dimension, bounded treewidth metrics, and planar metrics). Key to our approach is a new concept that we call bounded $ε$-scatter dimension, an intrinsic complexity measure of a metric space that is a relaxation of the standard notion of bounded doubling dimension. Our main technical result shows that two conditions are essentially sufficient for our algorithm to yield an EPAS on the input metric $M$ for any clustering objective: (i) the objective is described by a monotone (not necessarily symmetric!) norm, and (ii) the $ε$-scatter dimension of $M$ is upper bounded by a function of $ε$.
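
To make the phrase "monotone norm objective" concrete, here is a minimal Python sketch (not taken from the paper) that evaluates a few of the listed objectives on a vector of point-to-nearest-center distances. The function names and the sample distance vector are illustrative assumptions; the definitions used are the standard ones ($k$-center takes the maximum, $k$-median the sum, $\ell$-centrum the sum of the $\ell$ largest entries, ordered $k$-median a weighted sum of the entries sorted in non-increasing order).

```python
from typing import Sequence


def k_center_cost(distances: Sequence[float]) -> float:
    """Max norm of the distance vector."""
    return max(distances)


def k_median_cost(distances: Sequence[float]) -> float:
    """Sum (l_1 norm) of the distance vector."""
    return sum(distances)


def l_centrum_cost(distances: Sequence[float], ell: int) -> float:
    """Sum of the ell largest distances (top-ell norm)."""
    return sum(sorted(distances, reverse=True)[:ell])


def ordered_k_median_cost(distances: Sequence[float],
                          weights: Sequence[float]) -> float:
    """Weighted sum of distances sorted in non-increasing order.

    Assumes `weights` is non-negative and non-increasing, which is what
    makes the resulting function a norm.
    """
    ordered = sorted(distances, reverse=True)
    return sum(w * d for w, d in zip(weights, ordered))


# Hypothetical distances from three points to their nearest centers.
sample = [3, 1, 2]
print(k_center_cost(sample))                     # 3
print(k_median_cost(sample))                     # 6
print(l_centrum_cost(sample, 2))                 # 5
print(ordered_k_median_cost(sample, [2, 1, 0]))  # 8
```

Each of these functions is monotone: increasing any single distance never decreases the cost, which is condition (i) in the abstract.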

7.4 · DS · Mar 17

The Price of Being Partial: Complexity of Partial Generalized Dominating Set on Bounded-Treewidth Graphs

Jakob Greilhuber, Dániel Marx

For fixed sets $σ, ρ$ of non-negative integers, the $(σ, ρ)$-domination framework introduced by Telle [Nord. J. Comput. 1994] captures many classical graph problems. For a graph $G$, a $(σ,ρ)$-set is a set $S$ of vertices such that for every $v\in V(G)$, we have (1) if $v \in S$, then $|N(v) \cap S| \in σ$, and (2) if $v \notin S$, then $|N(v) \cap S| \in ρ$. We initiate the study of a natural partial variant $(σ,ρ)$-MinParDomSet of the problem, in which the constraints given by $σ, ρ$ need not be fulfilled for all vertices, but we want to find a set of size at most $k$ that maximizes the number of vertices that are satisfied in the sense that they satisfy (1) or (2) above. Our goal is to understand whether $(σ,ρ)$-MinParDomSet can be solved in the same running time as the nonpartial version, or whether it is strictly harder. Formally, we consider nonempty finite or simple cofinite sets $σ$ and $ρ$ (simple cofinite sets are of the form $\mathbb{Z}_{\geq c}$), and we try to determine the smallest constant $c_{σ,ρ}$ such that there is a $c_{σ,ρ}^{tw}\cdot n^{O(1)}$ time algorithm for the problem if a tree decomposition of width $tw$ is given. We obtain matching upper and lower bounds on $c_{σ,ρ}$ for every such fixed $σ$ and $ρ$ under the Primal Pathwidth Strong Exponential Time Hypothesis, and establish whether the partial problem is harder than the nonpartial variant. For some sets $σ$ and $ρ$, the more general $(σ,ρ)$-MinParDomSet has the same complexity as the nonpartial special case (e.g., for Dominating Set), while for other choices, the partial version is significantly harder (e.g., for Perfect Code).
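
The definition of a $(σ,ρ)$-set and of a "satisfied" vertex can be illustrated with a short Python sketch. This is only a checker for a given candidate set $S$, not the treewidth-based algorithm from the paper. The function names, the adjacency-dict graph representation, and the choice to model $σ$ and $ρ$ as membership predicates (so that cofinite sets such as $\mathbb{Z}_{\geq 1}$ can be expressed) are assumptions made for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Set

Graph = Dict[Hashable, Set[Hashable]]  # vertex -> set of neighbours
Membership = Callable[[int], bool]     # membership test for sigma or rho


def satisfied_vertices(graph: Graph, selected: Iterable[Hashable],
                       sigma: Membership, rho: Membership) -> Set[Hashable]:
    """Return the vertices that satisfy condition (1) or (2)."""
    chosen = set(selected)
    satisfied = set()
    for v, neighbours in graph.items():
        count = len(neighbours & chosen)          # |N(v) ∩ S|
        allowed = sigma if v in chosen else rho   # (1) if v in S, else (2)
        if allowed(count):
            satisfied.add(v)
    return satisfied


def is_sigma_rho_set(graph: Graph, selected: Iterable[Hashable],
                     sigma: Membership, rho: Membership) -> bool:
    """A (sigma, rho)-set satisfies every vertex of the graph."""
    return len(satisfied_vertices(graph, selected, sigma, rho)) == len(graph)


# Dominating Set: sigma = all non-negative integers, rho = Z_{>=1}.
def any_count(c: int) -> bool:
    return True


def at_least_one(c: int) -> bool:
    return c >= 1


# Perfect Code: sigma = {0}, rho = {1}.
def exactly_zero(c: int) -> bool:
    return c == 0


def exactly_one(c: int) -> bool:
    return c == 1


# Hypothetical example graph: the path 1 - 2 - 3 - 4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

print(is_sigma_rho_set(path, {2, 3}, any_count, at_least_one))   # True
print(is_sigma_rho_set(path, {1, 4}, exactly_zero, exactly_one)) # True
print(sorted(satisfied_vertices(path, {2}, exactly_zero, exactly_one)))  # [1, 2, 3]
```

In the last call, vertex 4 has no selected neighbour, so it violates condition (2) for Perfect Code; the partial variant would count three satisfied vertices for this choice of $S$.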

4.3 · DS · May 12, 2023

Parameterized Approximation for Robust Clustering in Discrete Geometric Spaces

Fateme Abbasi, Sandip Banerjee, Jarosław Byrka et al.

We consider the well-studied Robust $(k, z)$-Clustering problem, which generalizes the classic $k$-Median, $k$-Means, and $k$-Center problems. Given a constant $z\ge 1$, the input to Robust $(k, z)$-Clustering is a set $P$ of $n$ weighted points in a metric space $(M,δ)$ and a positive integer $k$. Further, each point belongs to one (or more) of $m$ different groups $S_1,S_2,\ldots,S_m$. Our goal is to find a set $X$ of $k$ centers such that $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$ is minimized. This problem arises in the domains of robust optimization [Anthony, Goyal, Gupta, Nagarajan, Math. Oper. Res. 2010] and in algorithmic fairness. For polynomial time computation, an approximation factor of $O(\log m/\log\log m)$ is known [Makarychev, Vakilian, COLT 2021], which is tight under a plausible complexity assumption even on line metrics. For FPT time, there is a $(3^z+ε)$-approximation algorithm, which is tight under GAP-ETH [Goyal, Jaiswal, Inf. Proc. Letters, 2023]. Motivated by the tight lower bounds for general discrete metrics, we focus on geometric spaces such as the (discrete) high-dimensional Euclidean setting and metrics of low doubling dimension, which play an important role in data analysis applications. First, for a universal constant $η_0 > 0.0006$, we devise a $3^z(1-η_{0})$-factor FPT approximation algorithm for discrete high-dimensional Euclidean spaces, thereby bypassing the lower bound for general metrics. We complement this result by showing that even the special case of $k$-Center in dimension $Θ(\log n)$ is $(\sqrt{3/2}- o(1))$-hard to approximate for FPT algorithms. Finally, we complete the FPT approximation landscape by designing an FPT $(1+ε)$-approximation scheme (EPAS) for the metric of sub-logarithmic doubling dimension.
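
As a reading aid for the objective $\max_{i \in [m]} \sum_{p \in S_i} w(p) δ(p,X)^z$, the following Python sketch evaluates it for a fixed set of centers. It only computes the cost of a given solution and does not implement any of the approximation algorithms above. The function name, the dictionary of weights, and the toy instance on the integer line are assumptions for illustration.

```python
from typing import Callable, Dict, Hashable, Sequence


def robust_cost(groups: Sequence[Sequence[Hashable]],
                weight: Dict[Hashable, float],
                centers: Sequence[Hashable],
                dist: Callable[[Hashable, Hashable], float],
                z: float = 1) -> float:
    """Maximum over groups of the weighted sum of (distance to nearest center)^z."""
    def group_cost(group: Sequence[Hashable]) -> float:
        return sum(
            weight[p] * min(dist(p, c) for c in centers) ** z
            for p in group
        )
    return max(group_cost(g) for g in groups)


# Hypothetical instance: three unit-weight points on the integer line,
# split into two groups.
points_weight = {0: 1, 1: 1, 10: 1}
groups = [[0, 1], [10]]


def line_dist(a: int, b: int) -> int:
    return abs(a - b)


print(robust_cost(groups, points_weight, [0], line_dist, z=2))      # 100
print(robust_cost(groups, points_weight, [1, 10], line_dist, z=2))  # 1
```

With a single center at 0, the second group dominates the cost ($10^2 = 100$); placing centers at 1 and 10 leaves only the first group with cost $1$.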

5.4 · AI · Jan 16, 2014

Soft Constraints of Difference and Equality

Emmanuel Hebrard, Dániel Marx, Barry O'Sullivan et al.

In many combinatorial problems one may need to model the diversity or similarity of assignments in a solution. For example, one may wish to maximise or minimise the number of distinct values in a solution. To formulate problems of this type, we can use soft variants of the well-known AllDifferent and AllEqual constraints. We present a taxonomy of six soft global constraints, generated by combining these two constraints with the two standard cost functions, which are either maximised or minimised. We characterise the complexity of achieving arc and bounds consistency on these constraints, resolving those cases for which NP-hardness was neither proven nor disproven. In particular, we explore in depth the constraint ensuring that at least k pairs of variables have a common value. We show that achieving arc consistency is NP-hard, whereas achieving bounds consistency can be done in polynomial time through dynamic programming. Moreover, we show that the maximum number of pairs of equal variables can be approximated within a factor of 1/2 by a linear-time greedy algorithm. Finally, we provide a fixed-parameter tractable algorithm with respect to the number of values appearing in more than two distinct domains. Interestingly, this taxonomy shows that enforcing equality is harder than enforcing difference.
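
The two quantities discussed in this abstract, the number of distinct values and the number of pairs of variables sharing a value, can be computed for a complete assignment as in the Python sketch below. This is only an evaluation of a given assignment, not a consistency algorithm or the greedy approximation from the paper. The function names and the sample assignment are assumptions for illustration.

```python
from collections import Counter
from math import comb
from typing import Dict, Hashable


def distinct_values(assignment: Dict[str, Hashable]) -> int:
    """Number of distinct values used by the assignment."""
    return len(set(assignment.values()))


def equal_pairs(assignment: Dict[str, Hashable]) -> int:
    """Number of unordered pairs of variables that take the same value."""
    counts = Counter(assignment.values())
    # A value used by c variables contributes c choose 2 equal pairs.
    return sum(comb(c, 2) for c in counts.values())


# Hypothetical assignment of four variables.
assignment = {"x1": "a", "x2": "a", "x3": "b", "x4": "a"}
print(distinct_values(assignment))  # 2
print(equal_pairs(assignment))      # 3
```

A constraint requiring at least k pairs of equal variables would accept this assignment for any k up to 3.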