ML LG CO
May 14, 2024

Scalable Subsampling Inference for Deep Neural Networks

arXiv:2405.08276v19.25 citations
h-index: 45
ACM IMS J Data Sci

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient and reliable inference methods in machine learning applications using deep neural networks, though it is incremental as it builds on existing error bounds and subsampling techniques.

The paper tackles statistical inference for deep neural networks by applying scalable subsampling to construct a 'subagged' DNN estimator. The estimator is computationally efficient, retains accuracy for estimation and prediction, and supports asymptotically valid confidence and prediction intervals that appear to work well in finite samples.

Deep neural networks (DNNs) have received increasing attention in machine learning applications in the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand gives a small improvement on the current error bound based on the latest results on the approximation ability of DNNs. More importantly, however, a non-random subsampling technique, scalable subsampling, is applied to construct a 'subagged' DNN estimator. Under regularity conditions, it is shown that the subagged DNN estimator is computationally efficient without sacrificing accuracy for either estimation or prediction tasks. Beyond point estimation/prediction, we propose different approaches to build confidence and prediction intervals based on the subagged DNN estimator. In addition to being asymptotically valid, the proposed confidence/prediction intervals appear to work well in finite samples. All in all, the scalable subsampling DNN estimator offers the complete package in terms of statistical inference, i.e., (a) computational efficiency; (b) point estimation/prediction accuracy; and (c) allowing for the construction of practically useful confidence and prediction intervals.
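The subagging idea in the abstract can be illustrated with a minimal sketch: split the sample into deterministic (non-random) blocks, fit one model per block, and average the fitted predictors. Everything specific here is an assumption for illustration rather than the paper's construction: the block size `b = n**0.7`, the non-overlapping step `h = b`, the function names, and above all the base learner, which is a least-squares line standing in for a ReLU DNN so that the example stays self-contained.

```python
import random
from statistics import fmean


def scalable_subsamples(n, b, h):
    """Return index ranges of the deterministic blocks of size b, shifted by h.

    Block j covers indices j*h, ..., j*h + b - 1, so there are
    q = floor((n - b) / h) + 1 blocks. No random draws are involved.
    """
    q = (n - b) // h + 1
    return [range(j * h, j * h + b) for j in range(q)]


def fit_line(xs, ys):
    """Stand-in base learner (assumption): least-squares line, returns a predictor."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def subagged_fit(xs, ys, b, h, fit=fit_line):
    """Fit one model per block and return the average of their predictions."""
    blocks = scalable_subsamples(len(xs), b, h)
    models = [fit([xs[i] for i in blk], [ys[i] for i in blk]) for blk in blocks]
    return lambda x: fmean(m(x) for m in models)


# Synthetic regression data: y = 2x + 1 + noise.
random.seed(0)
n = 1000
xs = [random.uniform(-1, 1) for _ in range(n)]
ys = [2.0 * x + 1.0 + random.gauss(0, 0.1) for x in xs]

b = int(n ** 0.7)  # hypothetical block size, not taken from the paper
predict = subagged_fit(xs, ys, b=b, h=b)
print(len(scalable_subsamples(n, b, b)), predict(0.5))
```

Each block model is trained on only `b` of the `n` observations and the fits are independent of one another, so they can run in parallel; this is the sense in which such a scheme can be cheaper than a single fit on the full sample. Swapping `fit_line` for a neural network trainer with the same signature would not change the aggregation logic.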
