DMJun 27, 2022
Sample compression schemes for balls in graphsJérémie Chalopin, Victor Chepoi, Fionn Mc Inerney et al.
One of the open problems in machine learning is whether any set-family of VC-dimension $d$ admits a sample compression scheme of size $O(d)$. In this paper, we study this problem for balls in graphs. For a ball $B=B_r(x)$ of a graph $G=(V,E)$, a realizable sample for $B$ is a signed subset $X=(X^+,X^-)$ of $V$ such that $B$ contains $X^+$ and is disjoint from $X^-$. A proper sample compression scheme of size $k$ consists of a compressor and a reconstructor. The compressor maps any realizable sample $X$ to a subsample $X'$ of size at most $k$. The reconstructor maps each such subsample $X'$ to a ball $B'$ of $G$ such that $B'$ includes $X^+$ and is disjoint from $X^-$. For balls of arbitrary radius $r$, we design proper labeled sample compression schemes of size $2$ for trees, of size $3$ for cycles, of size $4$ for interval graphs, of size $6$ for trees of cycles, and of size $22$ for cube-free median graphs. For balls of a given radius, we design proper labeled sample compression schemes of size $2$ for trees and of size $4$ for interval graphs. We also design approximate sample compression schemes of size 2 for balls of $δ$-hyperbolic graphs.
CCSep 6, 2023
Non-Clashing Teaching Maps for Balls in GraphsJérémie Chalopin, Victor Chepoi, Fionn Mc Inerney et al.
Recently, Kirkpatrick et al. [ALT 2019] and Fallat et al. [JMLR 2023] introduced non-clashing teaching and showed it is the most efficient machine teaching model satisfying the Goldman-Mathias collusion-avoidance criterion. A teaching map $T$ for a concept class $\mathcal{C}$ assigns a (teaching) set $T(C)$ of examples to each concept $C \in \mathcal{C}$. A teaching map is non-clashing if no pair of concepts are consistent with the union of their teaching sets. The size of a non-clashing teaching map (NCTM) $T$ is the maximum size of a teaching set $T(C)$, $C \in \mathcal{C}$. The non-clashing teaching dimension NCTD$(\mathcal{C})$ of $\mathcal{C}$ is the minimum size of an NCTM for $\mathcal{C}$. NCTM$^+$ and NCTD$^+(\mathcal{C})$ are defined analogously, except the teacher may only use positive examples. We study NCTMs and NCTM$^+$s for the concept class $\mathcal{B}(G)$ consisting of all balls of a graph $G$. We show that the associated decision problem B-NCTD$^+$ for NCTD$^+$ is NP-complete in split, co-bipartite, and bipartite graphs. Surprisingly, we even prove that, unless the ETH fails, B-NCTD$^+$ does not admit an algorithm running in time $2^{2^{o(\text{vc})}}\cdot n^{O(1)}$, nor a kernelization algorithm outputting a kernel with $2^{o(\text{vc})}$ vertices, where vc is the vertex cover number of $G$. We complement these lower bounds with matching upper bounds. These are extremely rare results: it is only the second problem in NP to admit such a tight double-exponential lower bound parameterized by vc, and only one of very few problems to admit such an ETH-based conditional lower bound on the number of vertices in a kernel. For trees, interval graphs, cycles, and trees of cycles, we derive NCTM$^+$s or NCTMs for $\mathcal{B}(G)$ of size proportional to its VC-dimension, and for Gromov-hyperbolic graphs, we design an approximate NCTM$^+$ of size 2.
COMar 29
Geometry of ample/lopsided setsHans--Jürgen Bandelt, Victor Chepoi, Andreas Dress et al.
Lopsided sets were introduced by Jim Lawrence in 1983 when he studied the subsets of $\{-1,+1\}^E$ that encode the intersection pattern of a convex set $K$ with the orthants of ${\mathbb R}^E$. Lopsided sets have been independently rediscovered by several other authors, in particular by Andreas Dress in 1995, who called them \emph{ample} sets. Dress defined ample sets as the set families satisfying equality in a combinatorial inequality, which holds for all set families. In a previous article we characterized ample sets in various combinatorial and graph-theoretical ways. In this paper we study geometric realizations of ample sets as cubihedra (cube complexes), which yields several new characterizations. One such characterization establishes that the cubihedra of ample sets endowed with the intrinsic $\ell_1$-metric are exactly the isometric subspaces of $\ell_1$-spaces (which we call, weakly convex sets). We also view the barycenter maps of faces of cubihedra of ample sets as collections of $\{ \pm 1, 0\}$-sign vectors and, in analogy with the characterization of oriented matroids by the covectors and the cocircuits. Moreover, we characterize the collections of $\{ \pm 1, 0\}$-sign vectors corresponding to barycenter maps of all faces and all maximal faces of an ample set. Furthermore, we show that any ample set $\covectors\subseteq \{ -1,+1\}^E$ is realizable as the intersection pattern of a weakly convex set $K$ with the orthants of ${\mathbb R}^E$. All this testifies that the concept of ample sets is quite natural in the context of cube complexes.
LGJun 29, 2025
Efficient Algorithms for Learning and Compressing Monophonic Halfspaces in GraphsMarco Bressan, Victor Chepoi, Emmanuel Esposito et al.
Abstract notions of convexity over the vertices of a graph, and corresponding notions of halfspaces, have recently gained attention from the machine learning community. In this work we study monophonic halfspaces, a notion of graph halfspaces defined through closure under induced paths. Our main result is a $2$-satisfiability based decomposition theorem, which allows one to represent monophonic halfspaces as a disjoint union of certain vertex subsets. Using this decomposition, we achieve efficient and (nearly) optimal algorithms for various learning problems, such as teaching, active, and online learning. Most notably, we obtain a polynomial-time algorithm for empirical risk minimization. Independently of the decomposition theorem, we obtain an efficient, stable, and proper sample compression scheme. This makes monophonic halfspaces efficiently learnable with proper learners and linear error rate $1/\varepsilon$ in the realizable PAC setting. Our results answer open questions from the literature, and show a stark contrast with geodesic halfspaces, for which most of the said learning problems are NP-hard.
COOct 28, 2021
Labeled sample compression schemes for complexes of oriented matroidsVictor Chepoi, Kolja Knauer, Manon Philibert
We show that the topes of a complex of oriented matroids (abbreviated COM) of VC-dimension $d$ admit a proper labeled sample compression scheme of size $d$. This considerably extends results of Moran and Warmuth on ample classes, of Ben-David and Litman on affine arrangements of hyperplanes, and of the authors on complexes of uniform oriented matroids, and is a step towards the sample compression conjecture -- one of the oldest open problems in computational learning theory. On the one hand, our approach exploits the rich combinatorial cell structure of COMs via oriented matroid theory. On the other hand, viewing tope graphs of COMs as partial cubes creates a fruitful link to metric graph theory.
DMDec 5, 2018
Unlabeled sample compression schemes and corner peelings for ample and maximum classesJérémie Chalopin, Victor Chepoi, Shay Moran et al.
We examine connections between combinatorial notions that arise in machine learning and topological notions in cubical/simplicial geometry. These connections enable to export results from geometry to machine learning. Our first main result is based on a geometric construction by Tracy Hall (2004) of a partial shelling of the cross-polytope which can not be extended. We use it to derive a maximum class of VC dimension 3 that has no corners. This refutes several previous works in machine learning from the past 11 years. In particular, it implies that all previous constructions of optimal unlabeled sample compression schemes for maximum classes are erroneous. On the positive side we present a new construction of an unlabeled sample compression scheme for maximum classes. We leave as open whether our unlabeled sample compression scheme extends to ample (a.k.a. lopsided or extremal) classes, which represent a natural and far-reaching generalization of maximum classes. Towards resolving this question, we provide a geometric characterization in terms of unique sink orientations of the 1-skeletons of associated cubical complexes.