LG · Nov 22, 2023

Adaptive Sampling for Deep Learning via Efficient Nonparametric Proxies

arXiv:2311.13583v1 · h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck of dynamic data sampling in neural network training. It offers a more efficient method that improves training speed and accuracy, but the advance is incremental because it builds on existing sampling strategies.

The paper tackles inefficient dynamic sampling in deep learning by proposing a sampling distribution based on nonparametric kernel regression, made practical by an efficient sketch-based approximation to the Nadaraya-Watson estimator. The method outperforms the baseline in wall-clock time and accuracy on four datasets.

Data sampling is an effective method to improve the training speed of neural networks, with recent results demonstrating that it can even break the neural scaling laws. These results critically rely on high-quality scores to estimate the importance of an input to the network. We observe that there are two dominant strategies: static sampling, where the scores are determined before training, and dynamic sampling, where the scores can depend on the model weights. Static algorithms are computationally inexpensive but less effective than their dynamic counterparts. Dynamic algorithms, in turn, can cause end-to-end slowdown because they need to explicitly compute losses. To address this problem, we propose a novel sampling distribution based on nonparametric kernel regression that learns an effective importance score as the neural network trains. However, nonparametric regression models are too computationally expensive to accelerate end-to-end training. Therefore, we develop an efficient sketch-based approximation to the Nadaraya-Watson estimator. Using recent techniques from high-dimensional statistics and randomized algorithms, we prove that our Nadaraya-Watson sketch approximates the estimator with exponential convergence guarantees. Our sampling algorithm outperforms the baseline in terms of wall-clock time and accuracy on four datasets.
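To make the idea in the abstract concrete, here is a minimal sketch of how an exact Nadaraya-Watson estimator could turn previously observed losses into a sampling distribution over new inputs. It is an illustration under stated assumptions, not the paper's algorithm: the Gaussian kernel, the `bandwidth` value, the function names, and the synthetic data are all hypothetical choices, and the sketch-based approximation that the paper develops is not reproduced here.

```python
import numpy as np

def nadaraya_watson(queries, keys, values, bandwidth=1.0):
    """Exact Nadaraya-Watson estimate with a Gaussian kernel (assumed kernel).

    Each query's estimate is a kernel-weighted average of `values`,
    weighted by its distance to the corresponding rows of `keys`.
    """
    # Pairwise squared Euclidean distances, shape (n_queries, n_keys).
    sq_dists = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(axis=-1)
    logits = -sq_dists / (2.0 * bandwidth ** 2)
    # Shift each row so its maximum is 0; the shift cancels in the ratio
    # below and avoids underflow to an all-zero row of weights.
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    return (weights @ values) / weights.sum(axis=1)

def sampling_distribution(scores, eps=1e-8):
    """Normalize nonnegative importance scores into probabilities."""
    scores = np.maximum(scores, 0.0) + eps  # eps keeps every item sampleable
    return scores / scores.sum()

# Synthetic stand-ins (assumptions): `keys` are inputs whose losses were
# already observed, `pool` holds candidates whose losses we want to avoid
# computing explicitly.
rng = np.random.default_rng(0)
keys = rng.normal(size=(200, 8))
losses = np.abs(keys[:, 0])          # hypothetical observed losses
pool = rng.normal(size=(1000, 8))

estimated_loss = nadaraya_watson(pool, keys, losses, bandwidth=1.0)
probs = sampling_distribution(estimated_loss)
batch = rng.choice(len(pool), size=32, replace=False, p=probs)
```

The exact estimator above costs time proportional to the number of stored examples for every query, which is the expense the abstract says motivates a sketch. If unbiased gradient estimates are required, importance-sampled examples are commonly reweighted by the inverse of their sampling probability.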

