Set Based Stochastic Subsampling
This work addresses data efficiency and scalability for deep learning practitioners, though the contribution is incremental: it builds on existing subsampling and attention techniques.
The paper tackles the problem of reducing data volume for deep models by proposing a two-stage neural subsampling method that outperforms baselines at low subsampling rates in tasks like image classification and reconstruction, and enhances scalability for nonparametric models like Neural Processes.
Deep models are designed to operate on huge volumes of high-dimensional data such as images. To reduce the volume of data these models must process, we propose a set-based, two-stage, end-to-end neural subsampling model that is jointly optimized with an \textit{arbitrary} downstream task network (e.g., a classifier). In the first stage, we efficiently subsample \textit{candidate elements} using conditionally independent Bernoulli random variables, capturing coarse-grained global information with set encoding functions. In the second stage, we perform conditionally dependent autoregressive subsampling of the candidate elements using Categorical random variables, modeling pairwise interactions with set attention networks. We apply our method to feature and instance selection and show that it outperforms the relevant baselines under low subsampling rates on a variety of tasks, including image classification, image reconstruction, function reconstruction, and few-shot classification. Additionally, for nonparametric models such as Neural Processes, which must leverage the entire training set at inference time, we show that our method enhances scalability.
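The two-stage sampling procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Every concrete choice here is an assumption made for illustration: mean pooling stands in for the set encoding function, a single dot-product attention layer stands in for the set attention network, and the weights `w_cand`, `w_query`, and `w_score` are hypothetical parameters that would normally be learned jointly with the task network. The top-up step used when stage one returns fewer than `k` candidates is also hypothetical, and gradient estimation for training is omitted entirely.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def two_stage_subsample(x, k, rng, w_cand, w_query, w_score):
    """Select k element indices from a set x of shape (n, d).

    Illustrative sketch only; all weights are hypothetical stand-ins.
    """
    n, d = x.shape
    assert k <= n, "cannot select more elements than the set contains"

    # Stage 1: candidate selection with conditionally independent Bernoullis.
    # Assumption: mean pooling provides the coarse global summary of the set.
    summary = x.mean(axis=0)                       # (d,)
    logits = (x + summary) @ w_cand                # (n,)
    probs = 1.0 / (1.0 + np.exp(-logits))
    keep = rng.random(n) < probs                   # one independent draw per element

    # Hypothetical fallback: top up with the most probable rejected elements
    # so that stage 2 always has at least k candidates.
    if keep.sum() < k:
        missing = k - int(keep.sum())
        order = np.argsort(-np.where(keep, -np.inf, probs))
        keep[order[:missing]] = True

    # Stage 2: autoregressive Categorical sampling without replacement.
    remaining = list(np.flatnonzero(keep))
    selected = []
    for _ in range(k):
        c = x[remaining]                           # (m, d) current candidates
        # Pairwise interactions among candidates via dot-product attention.
        attn = softmax((c @ w_query) @ c.T / np.sqrt(d), axis=-1)
        h = attn @ c                               # (m, d)
        # Condition on the elements already chosen (the autoregressive part).
        context = x[selected].mean(axis=0) if selected else np.zeros(d)
        p = softmax((h + context) @ w_score)       # Categorical over candidates
        j = rng.choice(len(remaining), p=p)
        selected.append(int(remaining.pop(j)))
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(20, 4))                   # a set of 20 elements in R^4
    idx = two_stage_subsample(
        x, k=5, rng=rng,
        w_cand=rng.normal(size=4),
        w_query=rng.normal(size=(4, 4)),
        w_score=rng.normal(size=4),
    )
    print(sorted(idx))
```

The split reflects the cost argument in the abstract: the Bernoulli stage touches each element once, so the quadratic attention computation in the second stage runs only over the smaller candidate set.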