MLLGJan 31, 2023

Simplex Random Features

Cambridge
arXiv:2301.13856v210 citationsh-index: 49
Originality Highly original
AI Analysis

This work provides a more accurate and efficient kernel approximation method for machine learning applications, such as scalable Transformers, though it is incremental within the random features domain.

The authors tackled the problem of approximating softmax and Gaussian kernels with random features, introducing Simplex Random Features (SimRFs) that achieve the smallest possible mean square error among unbiased estimates in a specific class, outperforming prior methods like Orthogonal Random Features without extra cost.

We present Simplex Random Features (SimRFs), a new random feature (RF) mechanism for unbiased approximation of the softmax and Gaussian kernels by geometrical correlation of random projection vectors. We prove that SimRFs provide the smallest possible mean square error (MSE) on unbiased estimates of these kernels among the class of weight-independent geometrically-coupled positive random feature (PRF) mechanisms, substantially outperforming the previously most accurate Orthogonal Random Features at no observable extra cost. We present a more computationally expensive SimRFs+ variant, which we prove is asymptotically optimal in the broader family of weight-dependent geometrical coupling schemes (which permit correlations between random vector directions and norms). In extensive empirical studies, we show consistent gains provided by SimRFs in settings including pointwise kernel estimation, nonparametric classification and scalable Transformers.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes