ML, Jul 29, 2013

Borel Isomorphic Dimensionality Reduction of Data and Supervised Learning

arXiv:1307.8333v1, 3 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses dimensionality reduction for machine learning practitioners, but it is incremental as it builds on prior suggestions and focuses on specific examples.

The paper tackles the problem of dimensionality reduction for supervised learning by applying Borel isomorphisms, showing that on a 256-dimensional phoneme dataset, reduction to 16 dimensions results in minimal accuracy drop.

In this project we further investigate the idea of reducing the dimensionality of datasets using a Borel isomorphism with the purpose of subsequently applying supervised learning algorithms, as originally suggested by my supervisor V. Pestov (in a 2011 Dagstuhl preprint). Any universally consistent learning algorithm, for example kNN, retains universal consistency after a Borel isomorphism is applied. We provide a series of concrete examples of Borel isomorphisms that reduce the number of dimensions in a dataset; in each, the data is multiplied by an orthogonal matrix before the dimensionality-reducing Borel isomorphism is applied. We test the accuracy of the resulting classifier in the lower-dimensional space on various datasets. Working with a phoneme voice recognition dataset of dimension 256 with 5 classes (phonemes), we show that a Borel isomorphic reduction to dimension 16 leads to a minimal drop in accuracy. In conclusion, we discuss further prospects of the method.
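To make the reduction step concrete, here is a minimal sketch of one standard Borel isomorphism between cubes: interleaving the binary digits of several coordinates into a single number. It is an illustration under stated assumptions, not the paper's implementation. The function names `interleave_digits` and `reduce_dimension` and the `bits` parameter are hypothetical. The sketch assumes the data has already been rotated by an orthogonal matrix and rescaled into [0, 1), and it truncates each coordinate to finitely many binary digits, so it only approximates the exact map.

```python
from fractions import Fraction


def interleave_digits(coords, bits=8):
    """Map a point of [0, 1)^m to one number in [0, 1).

    Each coordinate is truncated to its first `bits` binary digits
    (an assumption of this sketch), and the digits are interleaved:
    first digit of every coordinate, then second digit of every
    coordinate, and so on.
    """
    for x in coords:
        if not 0 <= x < 1:
            raise ValueError("coordinates must lie in [0, 1)")
    scale = 1 << bits
    ints = [int(x * scale) for x in coords]  # truncated binary expansions
    code = 0
    for b in range(bits - 1, -1, -1):  # most significant digit first
        for v in ints:
            code = (code << 1) | ((v >> b) & 1)
    # Exact rational result; float(code / 2**N) would round long codes.
    return Fraction(code, 1 << (bits * len(coords)))


def reduce_dimension(point, out_dim, bits=8):
    """Reduce a point of [0, 1)^d to [0, 1)^out_dim by splitting the
    coordinates into out_dim consecutive groups and interleaving the
    digits within each group."""
    d = len(point)
    if d % out_dim != 0:
        raise ValueError("out_dim must divide the input dimension")
    g = d // out_dim
    return [interleave_digits(point[i * g:(i + 1) * g], bits)
            for i in range(out_dim)]
```

For the 256-to-16 reduction mentioned in the abstract, each output coordinate would interleave a group of 16 input coordinates (`reduce_dimension(point, 16)`). The reduced points can then be converted with `float(...)` and passed to any classifier such as kNN, keeping in mind that the conversion rounds away the least significant interleaved digits.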
