Recycling Randomness with Structure for Sublinear time Kernel Expansions
This work eases the computational bottlenecks that machine learning practitioners face when using kernel methods, by providing a more flexible and efficient random feature framework.
To scale up kernel methods, the authors develop a framework that recycles Gaussian random vectors into structured matrices (Circulant, Toeplitz, Hankel, and other low-displacement-rank matrices) to approximate kernel functions in sublinear time via random embeddings. They prove that the resulting random feature maps are unbiased and have low variance. Their empirical results strongly support the theory and justify using a broader family of structured matrices for efficient kernel approximation.
We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings. Our framework includes the Fastfood construction as a special case, but also extends to Circulant, Toeplitz and Hankel matrices, and the broader family of structured matrices that are characterized by the concept of low-displacement rank. We introduce notions of coherence and graph-theoretic structural constants that control the approximation quality, and prove unbiasedness and low-variance properties of random feature maps that arise within our framework. For the case of low-displacement matrices, we show how the degree of structure and randomness can be controlled to reduce statistical variance at the cost of increased computation and storage requirements. Empirical results strongly support our theory and justify the use of a broader family of structured matrices for scaling up kernel methods using random features.
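As a rough illustration of the recycling idea, the sketch below handles only the simplest circulant case and is not the authors' implementation. A single Gaussian vector `g` defines an n x n circulant projection, which is applied with the FFT in O(n log n) time instead of O(n^2). The projection is then passed through cosine and sine features to approximate a Gaussian kernel. The function names `circulant_matvec` and `circulant_features`, the bandwidth parameter `sigma`, and the choice of the Gaussian kernel are assumptions made for this example. The abstract does not specify these details.

```python
import numpy as np

def circulant_matvec(g, x):
    """Apply the circulant matrix with first column g to x via the FFT.

    The matrix is C[i, j] = g[(i - j) % n], so C @ x is the circular
    convolution of g and x, computed here in O(n log n) time.
    """
    return np.real(np.fft.ifft(np.fft.fft(g) * np.fft.fft(x)))

def circulant_features(x, g, sigma=1.0):
    """Hypothetical feature map: cos/sin of a circulant Gaussian projection.

    Each row of the circulant matrix is a permutation of g, so every row
    is marginally a standard Gaussian vector even though the rows share
    the same n random numbers (the "recycled" randomness).
    """
    n = len(g)
    proj = circulant_matvec(g, x) / sigma
    return np.concatenate([np.cos(proj), np.sin(proj)]) / np.sqrt(n)

# Compare the feature-map inner product with the exact Gaussian kernel.
rng = np.random.default_rng(0)
n = 256
g = rng.standard_normal(n)                    # the only randomness used
x = rng.standard_normal(n) / np.sqrt(n)
y = x + 0.1 * rng.standard_normal(n) / np.sqrt(n)

approx = circulant_features(x, g) @ circulant_features(y, g)
exact = np.exp(-np.sum((x - y) ** 2) / 2.0)   # Gaussian kernel, sigma = 1
print(f"approximate kernel: {approx:.4f}, exact kernel: {exact:.4f}")
```

A dense Gaussian projection would need n^2 random numbers and O(n^2) work per input, whereas this sketch uses n random numbers and an FFT. The price is that the rows are correlated, which is the variance trade-off the abstract describes.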