Labeled compression schemes for extremal classes
This work advances the theory of sample compression by extending a result known for maximum classes to the broader family of extremal classes. The contribution is relevant to machine learning theory, though it is incremental relative to prior work on maximum classes.
The paper constructs sample compression schemes for extremal classes, a generalization of maximum classes, and obtains schemes whose size equals the VC dimension of the class. This settles a special case of a long-standing open question in learning theory.
It is a long-standing open problem whether there always exists a compression scheme whose size is of the order of the Vapnik-Chervonenkis (VC) dimension $d$. Recently, compression schemes of size exponential in $d$ have been found for any concept class of VC dimension $d$. Previously, compression schemes of size $d$ had been given for maximum classes, which are special concept classes whose size equals the Sauer-Shelah upper bound. We consider a generalization of maximum classes called extremal classes. Their definition is based on a powerful generalization of the Sauer-Shelah bound called the Sandwich Theorem, which has been studied in several areas of combinatorics and computer science. The key result of the paper is a construction of a sample compression scheme for extremal classes of size equal to their VC dimension. We also give a number of open problems concerning the combinatorial structure of extremal classes and the existence of unlabeled compression schemes for them.
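
To make the distinction between maximum and extremal classes concrete, the following is a minimal brute-force sketch for small finite classes. It is an illustration only and not the paper's construction. It assumes the standard characterization, which the abstract does not spell out, that a class $C$ is extremal when the number of sets it shatters equals $|C|$ (the upper bound of the Sandwich Theorem is tight), and that a class is maximum when $|C| = \sum_{i \le d} \binom{n}{i}$ for domain size $n$ and VC dimension $d$. All function names are hypothetical, and concepts are represented as frozensets of the points labeled 1.

```python
from itertools import combinations
from math import comb


def shattered_sets(concepts, domain):
    """Return every subset S of `domain` shattered by `concepts`.

    S is shattered when the concepts realize all 2^|S| labelings of S.
    """
    shattered = []
    for k in range(len(domain) + 1):
        for subset in combinations(sorted(domain), k):
            s = frozenset(subset)
            patterns = {c & s for c in concepts}  # restrictions of concepts to S
            if len(patterns) == 2 ** len(s):
                shattered.append(s)
    return shattered


def vc_dimension(concepts, domain):
    """Size of the largest shattered set (assumes a non-empty class)."""
    return max(len(s) for s in shattered_sets(concepts, domain))


def is_maximum(concepts, domain):
    """True when |C| meets the Sauer-Shelah bound with equality."""
    d = vc_dimension(concepts, domain)
    bound = sum(comb(len(domain), i) for i in range(d + 1))
    return len(concepts) == bound


def is_extremal(concepts, domain):
    """True when |C| equals the number of shattered sets (assumed definition)."""
    return len(concepts) == len(shattered_sets(concepts, domain))


domain = {0, 1}
# Extremal but not maximum: VC dimension 1, yet only 2 of the 3 allowed concepts.
c_extremal = {frozenset(), frozenset({0})}
# Not extremal: shatters the empty set, {0}, and {1}, but has only 2 concepts.
c_not_extremal = {frozenset(), frozenset({0, 1})}

print(is_maximum(c_extremal, domain), is_extremal(c_extremal, domain))
print(is_maximum(c_not_extremal, domain), is_extremal(c_not_extremal, domain))
```

The enumeration is exponential in the domain size, so this check is only practical for toy examples. The first example shows a class that is extremal without being maximum, which is the kind of class the result covers beyond earlier work.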