Christian Komusiewicz

DS
9papers
62citations
Novelty51%
AI Score53

9 Papers

75.3CCMay 4
The Parameter Report: An Orientation Guide for Data-Driven Parameterization

Christian Komusiewicz, Nils Morawietz, Frank Sommer et al.

A strength of parameterized algorithmics is that each problem can be parameterized by an essentially inexhaustible set of parameters. Usually, the choice of the considered parameter is informed by the theoretical relations between parameters with the general goal of achieving FPT-algorithms for smaller and smaller parameters. However, the FPT-algorithms for smaller parameters usually have higher running times and it is unclear whether the decrease in the parameter value or the increase in the running time bound dominates in real-world data. This question cannot be answered from purely theoretical considerations and any answer requires knowledge on typical parameter values. To provide a data-driven guideline for parameterized complexity studies of graph problems, we present the first comprehensive comparison of parameter values for a set of benchmark graphs originating from real-world applications. Our study covers degree-related parameters, such as maximum degree or degeneracy, neighborhood-based parameters such as neighborhood diversity and modular-width, modulator-based parameters such as vertex cover number and feedback vertex set number, and the treewidth of the graphs. Our results may help assess the significance of FPT-running time bounds on the solvability of real-world instances. For example, the vertex cover number $vc$ of $n$-vertex graphs is often only slightly below $n/2$. Thus, a running time bound of $O(2^{vc})$ is only slightly better than a running time bound of $O(1.4^{n})$. In contrast, the treewidth $tw$ is almost always below $n/3$ and often close to $n/10$, making a running time of $O(2^{tw})$ much more practical on real-world instances. We make our implementation and full experimental data openly available. In particular, this provides the first implementations for several graph parameters such as 4-path vertex cover number and vertex integrity.

DSApr 6, 2022
Efficient Bayesian Network Structure Learning via Parameterized Local Search on Topological Orderings

Niels Grüttemeier, Christian Komusiewicz, Nils Morawietz

In Bayesian Network Structure Learning (BNSL), one is given a variable set and parent scores for each variable and aims to compute a DAG, called Bayesian network, that maximizes the sum of parent scores, possibly under some structural constraints. Even very restricted special cases of BNSL are computationally hard, and, thus, in practice heuristics such as local search are used. A natural approach for a local search algorithm is a hill climbing strategy, where one replaces a given BNSL solution by a better solution within some pre-defined neighborhood as long as this is possible. We study ordering-based local search, where a solution is described via a topological ordering of the variables. We show that given such a topological ordering, one can compute an optimal DAG whose ordering is within inversion distance $r$ in subexponential FPT time; the parameter $r$ allows to balance between solution quality and running time of the local search algorithm. This running time bound can be achieved for BNSL without structural constraints and for all structural constraints that can be expressed via a sum of weights that are associated with each parent set. We also introduce a related distance called `window inversions distance' and show that the corresponding local search problem can also be solved in subexponential FPT time for the parameter $r$. For two further natural modification operations on the variable orderings, we show that algorithms with an FPT time for $r$ are unlikely. We also outline the limits of ordering-based local search by showing that it cannot be used for common structural constraints on the moralized graph of the network.

LGJun 7, 2023
On Computing Optimal Tree Ensembles

Christian Komusiewicz, Pascal Kunz, Frank Sommer et al.

Random forests and, more generally, (decision\nobreakdash-)tree ensembles are widely used methods for classification and regression. Recent algorithmic advances allow to compute decision trees that are optimal for various measures such as their size or depth. We are not aware of such research for tree ensembles and aim to contribute to this area. Mainly, we provide two novel algorithms and corresponding lower bounds. First, we are able to carry over and substantially improve on tractability results for decision trees: We obtain an algorithm that, given a training-data set and an size bound $S \in \mathbb{R}$, computes a tree ensemble of size at most $S$ that classifies the data correctly. The algorithm runs in $(4δD S)^S \cdot poly$-time, where $D$ the largest domain size, $δ$ is the largest number of features in which two examples differ, $n$ the number of input examples, and $poly$ a polynomial of the input size. For decision trees, that is, ensembles of size 1, we obtain a running time of $(δD s)^s \cdot poly$, where $s$ is the size of the tree. To obtain these algorithms, we introduce the witness-tree technique, which seems promising for practical implementations. Secondly, we show that dynamic programming, which has been applied successfully to computing decision trees, may also be viable for tree ensembles, providing an $\ell^n \cdot poly$-time algorithm, where $\ell$ is the number of trees. Finally, we compare the number of cuts necessary to classify training data sets for decision trees and tree ensembles, showing that ensembles may need exponentially fewer cuts for increasing number of trees.

37.3LOMar 23
The Descriptive Complexity of Relation Modification Problems

Florian Chudigiewitsch, Marlene Gründel, Christian Komusiewicz et al.

A relation modification problem gets a logical structure and a natural number k as input and asks whether k modifications of the structure suffice to make it satisfy a predefined property. We provide a complete classification of the classical and parameterized complexity of relation modification problems - the latter w. r. t. the modification budget k - based on the descriptive complexity of the respective target property. We consider different types of logical structures on which modifications are performed: Whereas monadic structures and undirected graphs without self-loops each yield their own complexity landscapes, we find that modifying undirected graphs with self-loops, directed graphs, or arbitrary logical structures is equally hard w. r. t. quantifier patterns. Moreover, we observe that all classes of problems considered in this paper are subject to a strong dichotomy in the sense that they are either very easy to solve (that is, they lie in paraAC^{0\uparrow} or TC^0) or intractable (that is, they contain W[2]-hard or NP-hard problems).

54.2DSMay 13
Clustering with Locally Bounded Ignorance

Jaroslav Garvardt, Christian Komusiewicz

In Correlation Clustering, the input is a graph $G=(V,E)$ with weight function $ω: {V \choose 2}\to Z$ and the task is to partition the vertex set into clusters such that the total weight of edges between clusters and missing edges inside clusters is minimized. Due to close connections between Correlation Clustering and Edge Multicut, deciding whether there is a partition with total cost at most $k$ is FPT with respect to $k$ but a polynomial kernel is presumably impossible. We study the influence of the structure of the fuzzy edge graph, that is, the graph induced by the weight-0 edges, on the problem complexity. We show in particular that Correlation Clustering admits a polynomial problem kernel when parameterized by $k+d$, where $d$ is the degeneracy of the fuzzy edge graph, and when parameterized by $k+c$, where $c$ is the closure of the fuzzy edge graph. We complement these positive results by showing hardness for several settings where the graph induced by the edges and nonedges has very restricted structure.

38.0DSMay 8
Parameterized Local Search for Vertex Cover: When only the Search Radius is Crucial

Christian Komusiewicz, Nils Morawietz

A vertex set $W$ in a graph $G$ is a valid $k$-swap for a vertex cover $S$ of $G$ if $W$ has size at most $k$ and $S'=(S \setminus W) \cup (W \setminus S)$, the symmetric difference of $S$ and $W$, is a vertex cover of $G$. If $|S'| < |S|$, then $W$ is improving. In LS Vertex Cover, one is given a vertex cover $S$ of a graph $G$ and wants to know if there is a valid improving $k$-swap for $S$ in $G$. In applications of LS Vertex Cover, $k$ is a very small parameter that can be set by a user to determine the trade-off between running time and solution quality. Consequently, $k$ can be considered to be a constant. Motivated by this and the fact that LS Vertex Cover is W[1]-hard with respect to $k$, we aim for algorithms with running time $\ell^{f(k)}\cdot n^{\mathcal{O}(1)}$ where $\ell$ is a structural graph parameter upper-bounded by $n$. We say that such a running time grows mildly with respect to $\ell$ and strongly with respect to $k$. We obtain algorithms with such a running time for $\ell$ being the $h$-index of $G$, the treewidth of $G$, or the modular-width of $G$. In addition, we consider a novel parameter, the maximum degree over all quotient graphs in a modular decomposition of $G$. Moreover, we adapt these algorithms to the more general problem where each vertex is assigned a weight and where we want to find a valid $d$-improving $k$-swap, that is, a valid $k$-swap which decreases the weight of the vertex cover by at least $d$.

4.7CCApr 25
On the Hardness of Finding Temporally Connected Subgraphs of Any Size

Arnaud Casteigts, Christian Komusiewicz, Nils Morawietz

Temporal graphs are graphs whose edges are only present at certain points in time. Reachability in these graphs relies on temporal paths, where edges are traversed chronologically. A temporal graph that offers all-pairs reachability is said to be temporally connected (or TC). For temporal graphs that are not TC, a natural question is whether they admit a TC subgraph (a.k.a. closed temporal component) of a given size $k$. This question was one of the earliest in the field, shown to be NP-hard by Bhadra and Ferreira in 2003. We strengthen this result dramatically, showing that deciding if a TC subgraph exists on at least $3$ vertices is already NP-hard in all the standard temporal graph settings (directed/undirected and strict/non-strict through simple and proper reductions). This implies a strong separation between closed temporal components and open temporal components (where temporal paths can travel outside the component), for which inclusion-maximal components can be found in polynomial time. As a by-product, our reductions strengthen a number of existing results and establish new derived results. They imply that the size of the largest TC subgraph cannot be approximated within a factor of $(1-ε)n$ in directed graphs, and within a factor of $(1-ε)\frac{n}{2}$ in undirected graphs. One of the reductions also completes the complexity landscape for TC subgraphs of size exactly $k$ when parameterized by $k$ (answering the missing non-strict case). Finally, on the structural side, our results imply that there exist arbitrarily large TC graphs of constant lifetime without nontrivial TC subgraphs, and we also show that there exist TC graphs of arbitrary girth, both facts being of independent interest.

DSMay 20, 2021
On the Parameterized Complexity of Polytree Learning

Niels Grüttemeier, Christian Komusiewicz, Nils Morawietz

A Bayesian network is a directed acyclic graph that represents statistical dependencies between variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. \textsc{Polytree Learning} is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree Learning} can be solved in $3^n \cdot |I|^{\mathcal{O}(1)}$ time where $n$ is the number of variables and $|I|$ is the total instance size. Moreover, we consider the influence of the number of variables $d$ that might receive a nonempty parent set in the final DAG on the complexity of \textsc{Polytree Learning}. We show that \textsc{Polytree Learning} has no $f(d)\cdot |I|^{\mathcal{O}(1)}$-time algorithm, unlike Bayesian network learning which can be solved in $2^d \cdot |I|^{\mathcal{O}(1)}$ time. We show that, in contrast, if $d$ and the maximum parent set size are bounded, then we can obtain efficient algorithms.

DSApr 30, 2020
Learning Bayesian Networks Under Sparsity Constraints: A Parameterized Complexity Analysis

Niels Grüttemeier, Christian Komusiewicz

We study the problem of learning the structure of an optimal Bayesian network when additional constraints are posed on the network or on its moralized graph. More precisely, we consider the constraint that the network or its moralized graph are close, in terms of vertex or edge deletions, to a sparse graph class $Π$. For example, we show that learning an optimal network whose moralized graph has vertex deletion distance at most $k$ from a graph with maximum degree 1 can be computed in polynomial time when $k$ is constant. This extends previous work that gave an algorithm with such a running time for the vertex deletion distance to edgeless graphs [Korhonen & Parviainen, NIPS 2015]. We then show that further extensions or improvements are presumably impossible. For example, we show that learning optimal networks where the network or its moralized graph have maximum degree $2$ or connected components of size at most $c$, $c\ge 3$, is NP-hard. Finally, we show that learning an optimal network with at most $k$ edges in the moralized graph presumably has no $f(k)\cdot |I|^{O(1)}$-time algorithm and that, in contrast, an optimal network with at most $k$ arcs can be computed in $2^{O(k)}\cdot |I|^{O(1)}$ time where $|I|$ is the total input size.