Yuexi Wang

ML
4papers
40citations
Novelty45%
AI Score22

4 Papers

MENov 22, 2021
Approximate Bayesian Computation via Classification

Yuexi Wang, Tetsuya Kaji, Veronika Ročková

Approximate Bayesian Computation (ABC) enables statistical inference in simulator-based models whose likelihoods are difficult to calculate but easy to simulate from. ABC constructs a kernel-type approximation to the posterior distribution through an accept/reject mechanism which compares summary statistics of real and simulated data. To obviate the need for summary statistics, we directly compare empirical distributions with a Kullback-Leibler (KL) divergence estimator obtained via contrastive learning. In particular, we blend flexible machine learning classifiers within ABC to automate fake/real data comparisons. We consider the traditional accept/reject kernel as well as an exponential weighting scheme which does not require the ABC acceptance threshold. Our theoretical results show that the rate at which our ABC posterior distributions concentrate around the true parameter depends on the estimation error of the classifier. We derive limiting posterior shape results and find that, with a properly scaled exponential kernel, asymptotic normality holds. We demonstrate the usefulness of our approach on simulated examples as well as real data in the context of stock volatility estimation.

STFeb 26, 2020
Uncertainty Quantification for Sparse Deep Learning

Yuexi Wang, Veronika Ročková

Deep learning methods continue to have a decided impact on machine learning, both in theory and in practice. Statistical theoretical developments have been mostly concerned with approximability or rates of estimation when recovering infinite dimensional objects (curves or densities). Despite the impressive array of available theoretical results, the literature has been largely silent about uncertainty quantification for deep learning. This paper takes a step forward in this important direction by taking a Bayesian point of view. We study Gaussian approximability of certain aspects of posterior distributions of sparse deep ReLU architectures in non-parametric regression. Building on tools from Bayesian non-parametrics, we provide semi-parametric Bernstein-von Mises theorems for linear and quadratic functionals, which guarantee that implied Bayesian credible regions have valid frequentist coverage. Our results provide new theoretical justifications for (Bayesian) deep learning with ReLU activation functions, highlighting their inferential potential.

MLMar 22, 2019
Data Augmentation for Bayesian Deep Learning

Yuexi Wang, Nicholas G. Polson, Vadim O. Sokolov

Deep Learning (DL) methods have emerged as one of the most powerful tools for functional approximation and prediction. While the representation properties of DL have been well studied, uncertainty quantification remains challenging and largely unexplored. Data augmentation techniques are a natural approach to provide uncertainty quantification and to incorporate stochastic Monte Carlo search into stochastic gradient descent (SGD) methods. The purpose of our paper is to show that training DL architectures with data augmentation leads to efficiency gains. We use the theory of scale mixtures of normals to derive data augmentation strategies for deep learning. This allows variants of the expectation-maximization and MCMC algorithms to be brought to bear on these high dimensional nonlinear deep learning models. To demonstrate our methodology, we develop data augmentation algorithms for a variety of commonly used activation functions: logit, ReLU, leaky ReLU and SVM. Our methodology is compared to traditional stochastic gradient descent with back-propagation. Our optimization procedure leads to a version of iteratively re-weighted least squares and can be implemented at scale with accelerated linear algebra methods providing substantial improvement in speed. We illustrate our methodology on a number of standard datasets. Finally, we conclude with directions for future research.

MLSep 1, 2017
Sparse Regularization in Marketing and Economics

Guanhao Feng, Nicholas Polson, Yuexi Wang et al.

Sparse alpha-norm regularization has many data-rich applications in Marketing and Economics. Alpha-norm, in contrast to lasso and ridge regularization, jumps to a sparse solution. This feature is attractive for ultra high-dimensional problems that occur in demand estimation and forecasting. The alpha-norm objective is nonconvex and requires coordinate descent and proximal operators to find the sparse solution. We study a typical marketing demand forecasting problem, grocery store sales for salty snacks, that has many dummy variables as controls. The key predictors of demand include price, equivalized volume, promotion, flavor, scent, and brand effects. By comparing with many commonly used machine learning methods, alpha-norm regularization achieves its goal of providing accurate out-of-sample estimates for the promotion lift effects. Finally, we conclude with directions for future research.