STML | Oct 16, 2020

Quantile regression with deep ReLU Networks: Estimators and minimax rates

arXiv:2010.08236v5 | 43 citations | Has Code
Originality: Highly original
AI Analysis

This work provides theoretical guarantees for quantile regression with neural networks, addressing a fundamental problem in statistics and machine learning for applications requiring robust estimation under heavy-tailed errors.

The paper studies quantile regression with ReLU neural networks. It derives an upper bound on the expected mean squared error that depends only on the best possible approximation error, the network depth, and the network width. The bounds are tight for compositions of Hölder functions and for members of a Besov space, which implies that ReLU networks achieve minimax rates for broad function classes and general error distributions, including heavy-tailed ones. Empirical simulations on synthetic response functions indicate that the theory carries over to practical implementations.

Quantile regression is the task of estimating a specified percentile response, such as the median, from a collection of known covariates. We study quantile regression with rectified linear unit (ReLU) neural networks as the chosen model class. We derive an upper bound on the expected mean squared error of a ReLU network used to estimate any quantile conditional on a set of covariates. This upper bound only depends on the best possible approximation error, the number of layers in the network, and the number of nodes per layer. We further show upper bounds that are tight for two large classes of functions: compositions of Hölder functions and members of a Besov space. These tight bounds imply ReLU networks with quantile regression achieve minimax rates for broad collections of function types. Unlike existing work, the theoretical results hold under minimal assumptions and apply to general error distributions, including heavy-tailed distributions. Empirical simulations on a suite of synthetic response functions demonstrate the theoretical results translate to practical implementations of ReLU networks. Overall, the theoretical and empirical results provide insight into the strong performance of ReLU neural networks for quantile regression across a broad range of function classes and error distributions. All code for this paper is publicly available at https://github.com/tansey/quantile-regression.
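To make the setup in the abstract concrete, the following is a minimal sketch of quantile regression with a ReLU network. It is not the authors' implementation (their code is in the repository linked above). It assumes the standard pinball (check) loss as the training objective, and the one-hidden-layer architecture, the function names `pinball_loss` and `fit_relu_quantile`, the step size, and the synthetic heavy-tailed data are all illustrative choices made here.

```python
import numpy as np


def pinball_loss(y, pred, tau):
    """Mean pinball (check) loss at quantile level tau in (0, 1)."""
    diff = y - pred
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def fit_relu_quantile(x, y, tau, hidden=32, lr=0.02, steps=1000, seed=0):
    """Fit a one-hidden-layer ReLU network by full-batch subgradient descent.

    Hypothetical helper for illustration; sizes and step size are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W1 = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 1.0 / np.sqrt(hidden), size=hidden)
    b2 = 0.0
    losses = []
    for _ in range(steps):
        z = x @ W1 + b1              # pre-activations, shape (n, hidden)
        h = np.maximum(z, 0.0)       # ReLU
        pred = h @ w2 + b2           # predicted conditional quantile, shape (n,)
        losses.append(pinball_loss(y, pred, tau))
        # Subgradient of the mean pinball loss with respect to each prediction:
        # -tau where the response lies above the prediction, 1 - tau otherwise.
        g = np.where(y > pred, -tau, 1.0 - tau) / n
        gh = g[:, None] * w2[None, :] * (z > 0)   # backpropagate through ReLU
        W1 -= lr * (x.T @ gh)
        b1 -= lr * gh.sum(axis=0)
        w2 -= lr * (h.T @ g)
        b2 -= lr * g.sum()

    def predict(x_new):
        return np.maximum(x_new @ W1 + b1, 0.0) @ w2 + b2

    return predict, losses


# Synthetic example (an assumption): smooth signal plus heavy-tailed Student-t noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=(500, 1))
noise = 0.3 * rng.standard_t(df=3, size=500)
y = 2.0 + np.sin(3.0 * x[:, 0]) + noise

predict, losses = fit_relu_quantile(x, y, tau=0.5)
final_loss = pinball_loss(y, predict(x), 0.5)
print(f"initial loss {losses[0]:.3f}, final loss {final_loss:.3f}")
```

Setting `tau=0.5` targets the conditional median; other values in (0, 1) target other quantiles. Because the per-sample subgradient is bounded by 1 in magnitude regardless of the size of the residual, large outliers do not dominate the updates, which is one informal way to see why the loss suits heavy-tailed errors.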

Code Implementations: 1 repo