CVMar 25, 2022
Digital Fingerprinting of MicrostructuresMichael D. White, Alexander Tarakanov, Christopher P. Race et al.
Finding efficient means of fingerprinting microstructural information is a critical step towards harnessing data-centric machine learning approaches. A statistical framework is systematically developed for compressed characterisation of a population of images, which includes some classical computer vision methods as special cases. The focus is on materials microstructure. The ultimate purpose is to rapidly fingerprint sample images in the context of various high-throughput design/make/test scenarios. This includes, but is not limited to, quantification of the disparity between microstructures for quality control, classifying microstructures, predicting materials properties from image data and identifying potential processing routes to engineer new materials with specific properties. Here, we consider microstructure classification and utilise the resulting features over a range of related machine learning tasks, namely supervised, semi-supervised, and unsupervised learning. The approach is applied to two distinct datasets to illustrate various aspects and some recommendations are made based on the findings. In particular, methods that leverage transfer learning with convolutional neural networks (CNNs), pretrained on the ImageNet dataset, are generally shown to outperform other methods. Additionally, dimensionality reduction of these CNN-based fingerprints is shown to have negligible impact on classification accuracy for the supervised learning approaches considered. In situations where there is a large dataset with only a handful of images labelled, graph-based label propagation to unlabelled data is shown to be favourable over discarding unlabelled data and performing supervised learning. In particular, label propagation by Poisson learning is shown to be highly effective at low label rates.
NAJun 28, 2016
Multilevel ensemble Kalman filteringHåkon Hoel, Kody J. H. Law, Raul Tempone
This work embeds a multilevel Monte Carlo sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF) in the setting of finite dimensional signal evolution and noisy discrete-time observations. The signal dynamics is assumed to be governed by a stochastic differential equation (SDE), and a hierarchy of time grids is introduced for multilevel numerical integration of that SDE. The resulting multilevel EnKF is proved to asymptotically outperform EnKF in terms of computational cost versus approximation accuracy. The theoretical results are illustrated numerically.
PRJun 28, 2016
Deterministic Mean-field Ensemble Kalman FilteringKody J. H. Law, Hamidou Tembine, Raul Tempone
The proof of convergence of the standard ensemble Kalman filter (EnKF) from Legland etal. (2011) is extended to non-Gaussian state space models. A density-based deterministic approximation of the mean-field limit EnKF (DMFEnKF) is proposed, consisting of a PDE solver and a quadrature rule. Given a certain minimal order of convergence $κ$ between the two, this extends to the deterministic filter approximation, which is therefore asymptotically superior to standard EnKF when the dimension $d<2κ$. The fidelity of approximation of the true distribution is also established using an extension of total variation metric to random measures. This is limited by a Gaussian bias term arising from non-linearity/non-Gaussianity of the model, which exists for both DMFEnKF and standard EnKF. Numerical results support and extend the theory.
MLFeb 3, 2024Code
Accelerating Look-ahead in Bayesian Optimization: Multilevel Monte Carlo is All you NeedShangda Yang, Vitaly Zankin, Maximilian Balandat et al.
We leverage multilevel Monte Carlo (MLMC) to improve the performance of multi-step look-ahead Bayesian optimization (BO) methods that involve nested expectations and maximizations. Often these expectations must be computed by Monte Carlo (MC). The complexity rate of naive MC degrades for nested operations, whereas MLMC is capable of achieving the canonical MC convergence rate for this type of problem, independently of dimension and without any smoothness assumptions. Our theoretical study focuses on the approximation improvements for twoand three-step look-ahead acquisition functions, but, as we discuss, the approach is generalizable in various ways, including beyond the context of BO. Our findings are verified numerically and the benefits of MLMC for BO are illustrated on several benchmark examples. Code is available at https://github.com/Shangda-Yang/MLMCBO .
MLFeb 9, 2024
Comparison of parallel SMC and MCMC for Bayesian deep learningXinzhu Liang, Joseph M. Lukens, Sanjaya Lohani et al.
This work systematically compares parallel implementations of consistent (asymptotically unbiased) Bayesian deep learning algorithms: sequential Monte Carlo sampler (SMC$_\parallel$) or Markov chain Monte Carlo (MCMC$_\parallel$). We provide a proof of convergence for SMC$_\parallel$ showing that it theoretically achieves the same level of convergence as a single monolithic SMC sampler, while the reduced communication lowers wall-clock time. It is well-known that the first samples from MCMC need to be discarded to eliminate initialization bias, and that the number of discarded samples must grow like the logarithm of the number of parallel chains to control that bias for MCMC$_\parallel$. A systematic empirical numerical study on MNIST, CIFAR, and IMDb, reveals that parallel implementations of both methods perform comparably to non-parallel implementations in terms of performance and total cost, and also comparably to each other. However, both methods still require a large wall-clock time, and suffer from catastrophic non-convergence if they aren't run for long enough.
MLMay 19, 2025
Scalable Bayesian Monte Carlo: fast uncertainty estimation beyond deep ensemblesXinzhu Liang, Joseph M. Lukens, Sanjaya Lohani et al.
This work introduces a new method designed for Bayesian deep learning called scalable Bayesian Monte Carlo (SBMC). The method is comprised of a model and an algorithm. The model interpolates between a point estimator and the posterior. The algorithm is a parallel implementation of sequential Monte Carlo sampler (SMC$_\parallel$) or Markov chain Monte Carlo (MCMC$_\parallel$). We collectively refer to these consistent (asymptotically unbiased) algorithms as Bayesian Monte Carlo (BMC), and any such algorithm can be used in our SBMC method. The utility of the method is demonstrated on practical examples: MNIST, CIFAR, IMDb. A systematic numerical study reveals that for the same wall-clock time as state-of-the-art (SOTA) methods like deep ensembles (DE), SBMC achieves comparable or better accuracy and substantially improved uncertainty quantification (UQ)--in particular, epistemic UQ. This is demonstrated on the downstream task of estimating the confidence in predictions, which can be used for reliability assessment or abstention decisions.
COFeb 24, 2021
Sparse online variational Bayesian regressionKody J. H. Law, Vitaly Zankin
This work considers variational Bayesian inference as an inexpensive and scalable alternative to a fully Bayesian approach in the context of sparsity-promoting priors. In particular, the priors considered arise from scale mixtures of Normal distributions with a generalized inverse Gaussian mixing distribution. This includes the variational Bayesian LASSO as an inexpensive and scalable alternative to the Bayesian LASSO introduced in [65]. It also includes a family of priors which more strongly promote sparsity. For linear models the method requires only the iterative solution of deterministic least squares problems. Furthermore, for p unknown covariates the method can be implemented exactly online with a cost of $O(p^3)$ in computation and $O(p^2)$ in memory per iteration -- in other words, the cost per iteration is independent of n, and in principle infinite data can be considered. For large $p$ an approximation is able to achieve promising results for a cost of $O(p)$ per iteration, in both computation and memory. Strategies for hyper-parameter tuning are also considered. The method is implemented for real and simulated data. It is shown that the performance in terms of variable selection and uncertainty quantification of the variational Bayesian LASSO can be comparable to the Bayesian LASSO for problems which are tractable with that method, and for a fraction of the cost. The present method comfortably handles $n = 65536$, $p = 131073$ on a laptop in less than 30 minutes, and $n = 10^5$, $p = 2.1 \times 10^6$ overnight.
MTRL-SCIJan 14, 2021
Materials Fingerprinting ClassificationAdam Spannaus, Kody J. H. Law, Piotr Luszczek et al.
Significant progress in many classes of materials could be made with the availability of experimentally-derived large datasets composed of atomic identities and three-dimensional coordinates. Methods for visualizing the local atomic structure, such as atom probe tomography (APT), which routinely generate datasets comprised of millions of atoms, are an important step in realizing this goal. However, state-of-the-art APT instruments generate noisy and sparse datasets that provide information about elemental type, but obscure atomic structures, thus limiting their subsequent value for materials discovery. The application of a materials fingerprinting process, a machine learning algorithm coupled with topological data analysis, provides an avenue by which here-to-fore unprecedented structural information can be extracted from an APT dataset. As a proof of concept, the material fingerprint is applied to high-entropy alloy APT datasets containing body-centered cubic (BCC) and face-centered cubic (FCC) crystal structures. A local atomic configuration centered on an arbitrary atom is assigned a topological descriptor, with which it can be characterized as a BCC or FCC lattice with near perfect accuracy, despite the inherent noise in the dataset. This successful identification of a fingerprint is a crucial first step in the development of algorithms which can extract more nuanced information, such as chemical ordering, from existing datasets of complex materials.
LGMay 15, 2019
Cluster, Classify, Regress: A General Method For Learning Discountinous FunctionsDavid E. Bernholdt, Mark R. Cianciosa, Clement Etienam et al.
This paper presents a method for solving the supervised learning problem in which the output is highly nonlinear and discontinuous. It is proposed to solve this problem in three stages: (i) cluster the pairs of input-output data points, resulting in a label for each point; (ii) classify the data, where the corresponding label is the output; and finally (iii) perform one separate regression for each class, where the training data corresponds to the subset of the original input-output pairs which have that label according to the classifier. It has not yet been proposed to combine these 3 fundamental building blocks of machine learning in this simple and powerful fashion. This can be viewed as a form of deep learning, where any of the intermediate layers can itself be deep. The utility and robustness of the methodology is illustrated on some toy problems, including one example problem arising from simulation of plasma fusion in a tokamak.