LG ML Oct 8, 2021

Hybrid Random Features

Krzysztof Choromanski, Haoxian Chen, Han Lin, Yuanzhe Ma, Arijit Sehanobish, Deepali Jain, Michael S Ryoo, Jake Varley, Andy Zeng, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani

arXiv:2110.04367v3 15.527 citations Has Code

Originality: Incremental advance

AI Analysis

This work provides a more accurate kernel approximation method for machine learning applications, including Transformers and robotics, but is incremental as it builds on prior random feature techniques.

The paper tackles the problem of approximating softmax and Gaussian kernels with random features by introducing hybrid random features (HRFs) that adapt kernel estimation quality to regions of interest, resulting in unbiased approximation and strictly smaller worst-case relative errors compared to existing methods.

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs), that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest. Special instantiations of HRFs lead to well-known methods such as trigonometric random features (Rahimi and Recht, 2007) or positive random features (Choromanski et al., 2021), the latter recently introduced in the context of linear-attention Transformers. By generalizing Bochner's Theorem for softmax/Gaussian kernels and leveraging random features for compositional kernels, the HRF mechanism provides strong theoretical guarantees: unbiased approximation and strictly smaller worst-case relative errors than its counterparts. We conduct an exhaustive empirical evaluation of HRFs, ranging from pointwise kernel estimation experiments, through tests on data admitting a clustering structure, to benchmarking implicit-attention Transformers (also for downstream robotics applications), demonstrating their quality across a wide spectrum of machine learning problems.
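To make the two special cases named in the abstract concrete, here is a minimal sketch of trigonometric and positive random features for the softmax kernel SM(x, y) = exp(x·y). It is not the HRF estimator itself, which the abstract does not specify in enough detail to reproduce. It only illustrates the two standard base estimators, using the identities E[cos(w·z)] = exp(-|z|²/2) and E[exp(w·z)] = exp(|z|²/2) for w ~ N(0, I). All function names, the feature count `m`, and the sample vectors are assumptions chosen for illustration.

```python
import math
import random


def softmax_kernel(x, y):
    """Exact softmax kernel SM(x, y) = exp(<x, y>)."""
    return math.exp(sum(a * b for a, b in zip(x, y)))


def sample_projections(m, d, seed=0):
    """Draw m Gaussian projection vectors w_i ~ N(0, I_d) (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]


def trig_features(x, W):
    """Trigonometric features: exp(|x|^2 / 2) / sqrt(m) * [cos(Wx), sin(Wx)]."""
    m = len(W)
    scale = math.exp(0.5 * sum(a * a for a in x)) / math.sqrt(m)
    proj = [sum(wi * xi for wi, xi in zip(w, x)) for w in W]
    return [scale * math.cos(p) for p in proj] + [scale * math.sin(p) for p in proj]


def positive_features(x, W):
    """Positive features: exp(Wx - |x|^2 / 2) / sqrt(m); every entry is > 0."""
    m = len(W)
    half_sq_norm = 0.5 * sum(a * a for a in x)
    return [
        math.exp(sum(wi * xi for wi, xi in zip(w, x)) - half_sq_norm) / math.sqrt(m)
        for w in W
    ]


def estimate(feature_map, x, y, W):
    """Unbiased kernel estimate as the dot product of the two feature vectors."""
    fx, fy = feature_map(x, W), feature_map(y, W)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    x = [0.3, -0.2, 0.5, 0.1]
    y = [0.1, 0.4, -0.3, 0.2]
    W = sample_projections(m=20000, d=len(x))
    print("exact        :", softmax_kernel(x, y))
    print("trigonometric:", estimate(trig_features, x, y, W))
    print("positive     :", estimate(positive_features, x, y, W))
```

Both estimators share the same Gaussian projections and differ only in the nonlinearity applied to them. That shared structure is what makes it plausible to combine them into a single hybrid estimator, as the abstract describes.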

