LG, ML · Jun 23, 2021

Bayesian Deep Learning Hyperparameter Search for Robust Function Mapping to Polynomials with Noise

Nidhin Harilal, Udit Bhatia, Auroop R. Ganguly

arXiv:2106.12532v1 · 1.6

Originality: Incremental advance

AI Analysis

This work addresses the challenge of hyperparameter selection in Bayesian Deep Learning for researchers and practitioners, but it is incremental as it builds on existing methods with new empirical insights.

The paper studies how to choose Bayesian Deep Learning hyperparameters for robust function mapping with uncertainty quantification by fitting Bayesian neural networks to noise-contaminated polynomials. Its results suggest an optimal network depth for prediction skill and an optimal ensemble size for uncertainty quantification, but no discernible optimum for width.

Advances in neural architecture search, as well as explainability and interpretability of connectionist architectures, have been reported in the recent literature. However, our understanding of how to design Bayesian Deep Learning (BDL) hyperparameters, specifically the depth, width, and ensemble size, for robust function mapping with uncertainty quantification is still emerging. This paper attempts to further our understanding by mapping Bayesian connectionist representations to polynomials of different orders with varying noise types and ratios. We examine the noise-contaminated polynomials to search for the combination of hyperparameters that can extract the underlying polynomial signals while quantifying uncertainties based on the noise attributes. Specifically, we study whether an appropriate neural architecture and ensemble configuration can be found to detect a signal of any n-th order polynomial contaminated with noise of different distributions, signal-to-noise ratios (SNR), and other noise attributes. Our results suggest the possible existence of an optimal network depth for prediction skill and an optimal number of ensembles for uncertainty quantification. However, optimality is not discernible for width, even though the performance gain diminishes as width increases at high values of width. Our experiments and insights can help guide the understanding of theoretical properties of BDL representations and the design of practical solutions.
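To make the experimental setup in the abstract concrete, the following is a minimal sketch of how a noise-contaminated polynomial with a target SNR might be generated. It is not the authors' code. The function name `noisy_polynomial`, the input range [-1, 1], the two noise families, and the definition of SNR as the ratio of signal variance to noise variance are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def noisy_polynomial(coeffs, n_samples=1000, snr=10.0, noise="gaussian", seed=0):
    """Sample an n-th order polynomial and contaminate it with zero-mean noise.

    Hypothetical helper. Assumes SNR = Var(signal) / Var(noise) and inputs
    drawn uniformly from [-1, 1]. `coeffs` are ordered from the constant
    term upward, as expected by numpy.polynomial.polynomial.polyval.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n_samples)
    signal = np.polynomial.polynomial.polyval(x, coeffs)

    # Choose the noise variance so that the requested SNR holds on average.
    noise_var = signal.var() / snr

    if noise == "gaussian":
        eps = rng.normal(0.0, np.sqrt(noise_var), size=n_samples)
    elif noise == "uniform":
        # A uniform variable on (-a, a) has variance a**2 / 3.
        half_width = np.sqrt(3.0 * noise_var)
        eps = rng.uniform(-half_width, half_width, size=n_samples)
    else:
        raise ValueError(f"unsupported noise type: {noise!r}")

    return x, signal, signal + eps


# Usage: a cubic contaminated with Gaussian noise at SNR = 5.
x, signal, y = noisy_polynomial([0.5, -1.0, 2.0, 1.5], n_samples=20000, snr=5.0)
empirical_snr = signal.var() / (y - signal).var()
print(f"empirical SNR: {empirical_snr:.2f}")
```

Sweeping the polynomial order, noise family, and `snr` in such a generator would yield the kind of dataset grid over which depth, width, and ensemble size could then be compared.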
