DMJun 27, 2022
Sample compression schemes for balls in graphsJérémie Chalopin, Victor Chepoi, Fionn Mc Inerney et al.
One of the open problems in machine learning is whether any set-family of VC-dimension $d$ admits a sample compression scheme of size $O(d)$. In this paper, we study this problem for balls in graphs. For a ball $B=B_r(x)$ of a graph $G=(V,E)$, a realizable sample for $B$ is a signed subset $X=(X^+,X^-)$ of $V$ such that $B$ contains $X^+$ and is disjoint from $X^-$. A proper sample compression scheme of size $k$ consists of a compressor and a reconstructor. The compressor maps any realizable sample $X$ to a subsample $X'$ of size at most $k$. The reconstructor maps each such subsample $X'$ to a ball $B'$ of $G$ such that $B'$ includes $X^+$ and is disjoint from $X^-$. For balls of arbitrary radius $r$, we design proper labeled sample compression schemes of size $2$ for trees, of size $3$ for cycles, of size $4$ for interval graphs, of size $6$ for trees of cycles, and of size $22$ for cube-free median graphs. For balls of a given radius, we design proper labeled sample compression schemes of size $2$ for trees and of size $4$ for interval graphs. We also design approximate sample compression schemes of size 2 for balls of $δ$-hyperbolic graphs.
CCSep 6, 2023
Non-Clashing Teaching Maps for Balls in GraphsJérémie Chalopin, Victor Chepoi, Fionn Mc Inerney et al.
Recently, Kirkpatrick et al. [ALT 2019] and Fallat et al. [JMLR 2023] introduced non-clashing teaching and showed it is the most efficient machine teaching model satisfying the Goldman-Mathias collusion-avoidance criterion. A teaching map $T$ for a concept class $\mathcal{C}$ assigns a (teaching) set $T(C)$ of examples to each concept $C \in \mathcal{C}$. A teaching map is non-clashing if no pair of concepts are consistent with the union of their teaching sets. The size of a non-clashing teaching map (NCTM) $T$ is the maximum size of a teaching set $T(C)$, $C \in \mathcal{C}$. The non-clashing teaching dimension NCTD$(\mathcal{C})$ of $\mathcal{C}$ is the minimum size of an NCTM for $\mathcal{C}$. NCTM$^+$ and NCTD$^+(\mathcal{C})$ are defined analogously, except the teacher may only use positive examples. We study NCTMs and NCTM$^+$s for the concept class $\mathcal{B}(G)$ consisting of all balls of a graph $G$. We show that the associated decision problem B-NCTD$^+$ for NCTD$^+$ is NP-complete in split, co-bipartite, and bipartite graphs. Surprisingly, we even prove that, unless the ETH fails, B-NCTD$^+$ does not admit an algorithm running in time $2^{2^{o(\text{vc})}}\cdot n^{O(1)}$, nor a kernelization algorithm outputting a kernel with $2^{o(\text{vc})}$ vertices, where vc is the vertex cover number of $G$. We complement these lower bounds with matching upper bounds. These are extremely rare results: it is only the second problem in NP to admit such a tight double-exponential lower bound parameterized by vc, and only one of very few problems to admit such an ETH-based conditional lower bound on the number of vertices in a kernel. For trees, interval graphs, cycles, and trees of cycles, we derive NCTM$^+$s or NCTMs for $\mathcal{B}(G)$ of size proportional to its VC-dimension, and for Gromov-hyperbolic graphs, we design an approximate NCTM$^+$ of size 2.