Empirical comparison between autoencoders and traditional dimensionality reduction methods
This work addresses efficient dimensionality reduction for practitioners and shows that traditional methods such as PCA remain competitive. The contribution is incremental: it confirms existing knowledge with new benchmarks.
This paper compared PCA, Isomap, and autoencoders for dimensionality reduction in classification tasks, finding that PCA achieved accuracy comparable to autoencoders on image datasets such as MNIST and CIFAR-10 while being two orders of magnitude faster to compute.
To process ever-higher-dimensional data such as images, sentences, or audio recordings efficiently, one needs a proper way to reduce the dimensionality of such data. SVD-based methods including PCA and Isomap have been used extensively for this. More recently, a neural network alternative, the autoencoder, has been proposed and is often preferred for its greater flexibility. This work aims to show that PCA is still a relevant technique for dimensionality reduction in the context of classification. To this end, we evaluated the performance of PCA against Isomap, a deep autoencoder, and a variational autoencoder. Experiments were conducted on three commonly used image datasets: MNIST, Fashion-MNIST, and CIFAR-10. The four dimensionality reduction techniques were applied separately to each dataset to project the data into a low-dimensional space. A k-NN classifier was then trained on each projection, with a cross-validated random search over the number of neighbours. Interestingly, our experiments revealed that k-NN achieved comparable accuracy on the projections of PCA and of both autoencoders, provided the projection dimension was large enough. However, PCA was two orders of magnitude faster to compute than its neural network counterparts.
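The evaluation protocol described above (project the data, then tune a k-NN classifier with a cross-validated random search over the number of neighbours) can be sketched as follows for the PCA case. This is a minimal illustration and not the authors' code: it assumes scikit-learn and SciPy are available, substitutes scikit-learn's small built-in digits dataset for MNIST so that it runs quickly, and the function name `evaluate_pca_knn`, the projection dimension, the search range, and the number of folds are all hypothetical choices.

```python
import time

from scipy.stats import randint
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier


def evaluate_pca_knn(n_components=16, n_iter=5, seed=0):
    """Project data with PCA, then tune k-NN on the projection.

    Assumption: the 8x8 digits dataset stands in for MNIST; all
    hyperparameters here are illustrative, not the paper's settings.
    """
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    # Fit the projection on the training split only and time the fit.
    start = time.perf_counter()
    pca = PCA(n_components=n_components, random_state=seed).fit(X_train)
    fit_seconds = time.perf_counter() - start

    Z_train = pca.transform(X_train)
    Z_test = pca.transform(X_test)

    # Cross-validated random search over the number of neighbours (1 to 15).
    search = RandomizedSearchCV(
        KNeighborsClassifier(),
        param_distributions={"n_neighbors": randint(1, 16)},
        n_iter=n_iter,
        cv=3,
        random_state=seed,
    )
    search.fit(Z_train, y_train)

    return {
        "accuracy": search.score(Z_test, y_test),
        "best_k": search.best_params_["n_neighbors"],
        "pca_fit_seconds": fit_seconds,
        "projected_dim": Z_train.shape[1],
    }


if __name__ == "__main__":
    print(evaluate_pca_knn())
```

Swapping the `PCA` step for another projection (for example `sklearn.manifold.Isomap`, or the encoder half of a trained autoencoder) while keeping the k-NN search unchanged would give the kind of like-for-like comparison the abstract describes, with the timing measured around the fitting step of each method.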