LG MLFeb 5, 2020

Nested Barycentric Coordinate System as an Explicit Feature Map

Lee-Ad Gottlieb, Eran Kaufman, Aryeh Kontorovich, Gabriel Nivasch, Ofir Pele

arXiv:2002.01999v11.2

Originality Highly original

AI Analysis

This provides a novel embedding approach for high-sample, low-dimensional data that achieves better performance than existing kernel methods.

The authors tackled the problem of embedding data when sample size greatly exceeds ambient dimension by partitioning space into simplices and using barycentric coordinates as features, then training a linear classifier in this feature space. Their method consistently outperformed popular kernel embedding methods in empirical evaluations.

We propose a new embedding method which is particularly well-suited for settings where the sample size greatly exceeds the ambient dimension. Our technique consists of partitioning the space into simplices and then embedding the data points into features corresponding to the simplices' barycentric coordinates. We then train a linear classifier in the rich feature space obtained from the simplices. The decision boundary may be highly non-linear, though it is linear within each simplex (and hence piecewise-linear overall). Further, our method can approximate any convex body. We give generalization bounds based on empirical margin and a novel hybrid sample compression technique. An extensive empirical evaluation shows that our method consistently outperforms a range of popular kernel embedding methods.

View on arXiv PDF

Similar