Near-optimal sample compression for nearest neighbors
This work addresses a foundational problem in machine learning for improving nearest neighbor classification efficiency and theory.
The paper tackles the problem of sample compression for nearest neighbor classification by presenting the first algorithm with non-trivial performance guarantees, complemented by nearly matching lower bounds, and shows empirical results.
We present the first sample compression algorithm for nearest neighbors with non-trivial performance guarantees. We complement these guarantees by demonstrating almost matching hardness lower bounds, which show that our bound is nearly optimal. Our result yields new insight into margin-based nearest neighbor classification in metric spaces and allows us to significantly sharpen and simplify existing bounds. Some encouraging empirical results are also presented.