ML · CV · LG · Mar 9, 2023

Kernel Regression with Infinite-Width Neural Networks on Millions of Examples

DeepMind
arXiv:2303.05420v1 · 12 citations · h-index: 37
Originality: Incremental advance
AI Analysis

This work addresses the scalability problem facing researchers and practitioners who apply neural kernels to large datasets. It is an incremental advance: it builds on existing kernel methods and improves their efficiency.

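The authors tackle the high computational cost of neural kernels by massively parallelizing kernel computation across many GPUs and pairing it with a distributed, preconditioned conjugate gradients algorithm. This enables kernel regression on up to five million examples and a study of scaling laws on the CIFAR-5m dataset. With data augmentation expanding the original CIFAR-10 training set by a factor of 20, they report a test accuracy of 91.2%, state of the art for a pure kernel method.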

Neural kernels have drastically increased performance on diverse and nonstandard data modalities but require significantly more compute, which previously limited their application to smaller datasets. In this work, we address this by massively parallelizing their computation across many GPUs. We combine this with a distributed, preconditioned conjugate gradients algorithm to enable kernel regression at a large scale (i.e. up to five million examples). Using this approach, we study scaling laws of several neural kernels across many orders of magnitude for the CIFAR-5m dataset. Using data augmentation to expand the original CIFAR-10 training dataset by a factor of 20, we obtain a test accuracy of 91.2% (SotA for a pure kernel method). Moreover, we explore neural kernels on other data modalities, obtaining results on protein and small molecule prediction tasks that are competitive with SotA methods.
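
The abstract names preconditioned conjugate gradients as the solver for kernel regression. The idea can be illustrated with a minimal single-process sketch. It is a toy under stated assumptions, not the authors' distributed implementation: an RBF kernel stands in for a neural kernel, the preconditioner is a plain diagonal (Jacobi) one, and every function name is hypothetical. Because the RBF diagonal is constant, the Jacobi step here only rescales the residual; it marks where a stronger preconditioner would plug in.

```python
import math

def rbf_kernel(x, y, lengthscale=0.5):
    # Stand-in kernel (assumption): the paper uses neural kernels instead.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def pcg_solve(A, b, precond_diag, tol=1e-10, max_iters=1000):
    # Solve A x = b for symmetric positive-definite A using conjugate
    # gradients with a diagonal (Jacobi) preconditioner.
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    z = [ri / di for ri, di in zip(r, precond_diag)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for _ in range(max_iters):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            break
        Ap = matvec(A, p)
        step = rz / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        z = [ri / di for ri, di in zip(r, precond_diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x

def regularized_gram(train_x, reg):
    # Gram matrix K + reg * I over the training inputs.
    n = len(train_x)
    return [[rbf_kernel(train_x[i], train_x[j]) + (reg if i == j else 0.0)
             for j in range(n)] for i in range(n)]

def fit_kernel_regression(train_x, train_y, reg=1e-2):
    # Returns dual coefficients alpha solving (K + reg * I) alpha = y.
    K = regularized_gram(train_x, reg)
    diag = [K[i][i] for i in range(len(train_x))]
    return pcg_solve(K, train_y, diag)

def predict(train_x, alpha, x):
    return sum(a * rbf_kernel(xi, x) for a, xi in zip(alpha, train_x))

# Toy usage: regress sin(2*pi*t) from eight evenly spaced points.
train_x = [[i / 7.0] for i in range(8)]
train_y = [math.sin(2.0 * math.pi * x[0]) for x in train_x]
alpha = fit_kernel_regression(train_x, train_y)
prediction = predict(train_x, alpha, [0.3])
```

An iterative solver suits this setting because it only needs matrix-vector products with the kernel matrix, which can be split into row blocks and computed on separate devices. The dense, in-memory Gram matrix used here would not fit at the five-million-example scale the abstract describes.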
