Compressive Mahalanobis Metric Learning Adapts to Intrinsic Dimension
This work addresses metric learning for distance-based algorithms on high-dimensional data and offers a method that adapts to the intrinsic dimensionality of the data. The contribution appears incremental, since it builds on existing compression and metric learning techniques.
The paper tackles the problem of learning a Mahalanobis metric in high-dimensional settings by training a full-rank metric on randomly compressed data. Its theoretical error bounds depend on the stable dimension of the data support rather than on the ambient dimension, and numerical experiments support the findings.
Metric learning aims to find a suitable distance metric over the input space, in order to improve the performance of distance-based learning algorithms. In high-dimensional settings, it can also serve as dimensionality reduction by imposing a low-rank restriction on the learnt metric. In this paper, we consider the problem of learning a Mahalanobis metric. Instead of training a low-rank metric on high-dimensional data, we use a randomly compressed version of the data to train a full-rank metric in the reduced feature space. We give theoretical guarantees on the error for Mahalanobis metric learning, which depend on the stable dimension of the data support, but not on the ambient dimension. Our bounds make no assumptions aside from i.i.d. data sampling from a bounded support, and automatically tighten when benign geometrical structures are present. An important ingredient is an extension of Gordon's theorem, which may be of independent interest. We also corroborate our findings with numerical experiments.
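The compress-then-learn pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the data are synthetic points lying on a 5-dimensional subspace of a 1000-dimensional space, and the metric learner is a simple stand-in (the regularised inverse of the within-class covariance), not the learning algorithm analysed in the paper. It only shows the shape of the idea: project with a Gaussian random matrix, then fit a full-rank Mahalanobis matrix in the compressed space.

```python
import numpy as np

def gaussian_projection(d, k, rng):
    """Hypothetical helper: k x d matrix with i.i.d. N(0, 1/k) entries."""
    return rng.normal(0.0, 1.0 / np.sqrt(k), size=(k, d))

def fit_full_rank_metric(Z, y, reg=1e-3):
    """Stand-in learner (an assumption, not the paper's method):
    regularised inverse of the within-class covariance of Z."""
    k = Z.shape[1]
    S = np.zeros((k, k))
    for c in np.unique(y):
        Zc = Z[y == c]
        Zc = Zc - Zc.mean(axis=0)      # centre each class separately
        S += Zc.T @ Zc
    S /= len(Z)
    M = np.linalg.inv(S + reg * np.eye(k))
    return (M + M.T) / 2.0             # enforce exact symmetry

def mahalanobis(M, u, v):
    """Distance sqrt((u - v)^T M (u - v)) under the metric matrix M."""
    diff = u - v
    return float(np.sqrt(diff @ M @ diff))

rng = np.random.default_rng(0)
d, k, n, intrinsic = 1000, 20, 200, 5  # ambient, compressed, samples, intrinsic

# Synthetic two-class data on a low-dimensional subspace of R^d.
basis = rng.normal(size=(intrinsic, d)) / np.sqrt(d)
y = rng.integers(0, 2, size=n)
latent = rng.normal(size=(n, intrinsic)) + 3.0 * y[:, None]
X = latent @ basis                     # shape (n, d)

R = gaussian_projection(d, k, rng)     # random compression matrix
Z = X @ R.T                            # compressed data, shape (n, k)
M = fit_full_rank_metric(Z, y)         # full-rank k x k metric

print(Z.shape, M.shape)
print(mahalanobis(M, Z[0], Z[1]))
```

In this sketch the learner only ever sees the k-dimensional compressed data, so its cost scales with k rather than with d. Any other full-rank Mahalanobis learner could be substituted for `fit_full_rank_metric`.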