On the Spectrum of Random Features Maps of High Dimensional Data
This work addresses the challenge of understanding and tuning random feature techniques in machine learning. It is incremental, however, since it builds on existing theory without introducing a new paradigm.
The paper tackles the problem of analyzing the Gram matrix of random feature maps for high-dimensional Gaussian mixture models, using random matrix theory to reveal the interplay between nonlinearity and data statistics, which aids in tuning random feature-based methods.
Random feature maps are ubiquitous in modern statistical machine learning, where they generalize random projections by means of powerful, yet often difficult-to-analyze, nonlinear operators. In this paper, we leverage the "concentration" phenomenon induced by random matrix theory to perform a spectral analysis of the Gram matrix of these random feature maps, here for Gaussian mixture models of simultaneously large dimension and size. Our results are instrumental to a deeper understanding of the interplay between the nonlinearity and the statistics of the data, thereby allowing for a better tuning of random feature-based techniques.
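To make the object of study concrete, the following is a minimal numerical sketch of a random feature Gram matrix and its spectrum for a two-class Gaussian mixture. It is an illustration under stated assumptions, not the paper's exact setup: the helper names (`gaussian_mixture`, `random_feature_gram`), the identity covariance, the 1/sqrt(p) data scaling, the standard Gaussian weights, the ReLU nonlinearity, and all sizes are choices made here for the example.

```python
import numpy as np

def gaussian_mixture(T, p, means, rng):
    """Draw T points in R^p from a balanced mixture (assumed identity covariance)."""
    labels = rng.integers(0, len(means), size=T)
    M = np.stack(means, axis=1)                      # p x K matrix of class means
    # Assumed scaling: divide by sqrt(p) so that sample norms stay of order one.
    X = (M[:, labels] + rng.standard_normal((p, T))) / np.sqrt(p)
    return X, labels

def random_feature_gram(X, n_features, sigma, rng):
    """Gram matrix of the random features sigma(W X), with W standard Gaussian."""
    p, _ = X.shape
    W = rng.standard_normal((n_features, p))         # random projection
    S = sigma(W @ X)                                 # n_features x T feature matrix
    return S.T @ S / n_features                      # T x T Gram matrix

rng = np.random.default_rng(0)
p, T, n_features = 128, 256, 512                     # dimension, sample size, feature count
mu = np.zeros(p)
mu[0] = 4.0                                          # hypothetical class-mean separation
X, labels = gaussian_mixture(T, p, [mu, -mu], rng)

relu = lambda z: np.maximum(z, 0.0)                  # one possible nonlinearity
G = random_feature_gram(X, n_features, relu, rng)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigvals = np.linalg.eigvalsh(G)
print("largest eigenvalues:", eigvals[-3:])
```

Swapping `relu` for another function (for example `np.cos` or `np.tanh`) and comparing the resulting eigenvalue histograms is one simple way to observe how the nonlinearity interacts with the data statistics in the regime where `p`, `T`, and `n_features` are all large.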