LG · Sep 23, 2022
Jensen-Shannon Divergence Based Novel Loss Functions for Bayesian Neural Networks
Ponkrshnan Thiagarajan, Susanta Ghosh
Bayesian neural networks (BNNs) are state-of-the-art machine learning methods that can naturally regularize and systematically quantify uncertainties using their stochastic parameters. Kullback-Leibler (KL) divergence-based variational inference used in BNNs suffers from unstable optimization and challenges in approximating light-tailed posteriors due to the unbounded nature of the KL divergence. To resolve these issues, we formulate a novel loss function for BNNs based on a new modification to the generalized Jensen-Shannon (JS) divergence, which is bounded. In addition, we propose a Geometric JS divergence-based loss, which is computationally efficient since it can be evaluated analytically. We found that the JS divergence-based variational inference is intractable, and hence employed a constrained optimization framework to formulate these losses. Our theoretical analysis and empirical experiments on multiple regression and classification data sets suggest that the proposed losses perform better than the KL divergence-based loss, especially when the data sets are noisy or biased. Specifically, there are approximately 5% and 8% improvements in accuracy for a noise-added CIFAR-10 dataset and a regression dataset, respectively. There is about a 13% reduction in false-negative predictions on a biased histopathology dataset. In addition, we quantify and compare the uncertainty metrics for the regression and classification tasks.
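The contrast between an unbounded KL divergence and a bounded JS divergence can be illustrated with a minimal sketch. It uses the standard (unskewed) JS divergence on two discrete distributions, not the modified generalized JS divergence proposed in the paper. The helper names `kl` and `js` and the two-point distributions are assumptions chosen purely for illustration.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) in nats for discrete distributions; assumes q > 0 wherever p > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p == 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def js(p, q):
    """Standard JS divergence: average KL of p and q to their equal-weight mixture."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Two nearly disjoint two-point distributions (illustrative values only).
eps = 1e-12
p = np.array([1.0 - eps, eps])
q = np.array([eps, 1.0 - eps])

kl_pq = kl(p, q)  # grows without bound as eps -> 0
js_pq = js(p, q)  # never exceeds ln(2) in nats
```

As `eps` shrinks, `kl_pq` keeps increasing, while `js_pq` approaches but does not exceed `np.log(2)`.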
MTRL-SCI · Jul 11, 2025
Surprisingly High Redundancy in Electronic Structure Data
Sazzad Hossain, Ponkrshnan Thiagarajan, Shashank Pathrudkar et al.
Machine Learning (ML) models for electronic structure rely on large datasets generated through expensive Kohn-Sham Density Functional Theory simulations. This study reveals a surprisingly high level of redundancy in such datasets across various material systems, including molecules, simple metals, and complex alloys. Our findings challenge the prevailing assumption that large, exhaustive datasets are necessary for accurate ML predictions of electronic structure. We demonstrate that even random pruning can substantially reduce dataset size with minimal loss in predictive accuracy, while a state-of-the-art coverage-based pruning strategy retains chemical accuracy and model generalizability using up to 100-fold less data and reducing training time by threefold or more. By contrast, widely used importance-based pruning methods, which eliminate seemingly redundant data, can catastrophically fail at higher pruning factors, possibly due to the significant reduction in data coverage. This heretofore unexplored high degree of redundancy in electronic structure data holds the potential to identify a minimal, essential dataset representative of each material class.
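Random pruning, the simplest baseline mentioned above, can be sketched as follows. This is only an illustration under stated assumptions: the function name `random_prune`, the `pruning_factor` parameter, and the toy arrays are hypothetical, and the coverage-based and importance-based strategies discussed in the abstract are not implemented here.

```python
import numpy as np

def random_prune(features, targets, pruning_factor, seed=0):
    """Keep roughly 1/pruning_factor of the samples, chosen uniformly at random."""
    features = np.asarray(features)
    targets = np.asarray(targets)
    n_total = features.shape[0]
    n_keep = max(1, n_total // pruning_factor)  # always keep at least one sample
    rng = np.random.default_rng(seed)
    keep = np.sort(rng.choice(n_total, size=n_keep, replace=False))
    return features[keep], targets[keep], keep

# Toy stand-in for a dataset of descriptors and target values.
X = np.arange(2000, dtype=float).reshape(1000, 2)
y = X.sum(axis=1)
X_small, y_small, kept = random_prune(X, y, pruning_factor=100)
```

The returned index array `kept` makes it possible to trace which original samples survived pruning.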
LG · May 9, 2024
Gradient Flow Based Phase-Field Modeling Using Separable Neural Networks
Revanth Mattey, Susanta Ghosh
The $L^2$ gradient flow of the Ginzburg-Landau free energy functional leads to the Allen-Cahn equation, which is widely used for modeling phase separation. Machine learning methods for solving the Allen-Cahn equation in its strong form suffer from inaccuracies in collocation techniques, errors in computing higher-order spatial derivatives through automatic differentiation, and the large system size required by the space-time approach. To overcome these limitations, we propose a separable neural network-based approximation of the phase field in a minimizing movement scheme to solve the aforementioned gradient flow problem. At each time step, the separable neural network is used to approximate the phase field in space through a low-rank tensor decomposition, thereby accelerating the derivative calculations. The minimizing movement scheme naturally allows for the use of Gauss quadrature to compute the functional. A $\tanh$ transformation is applied to the neural network-predicted phase field to strictly bound the solutions within the values of the two phases. For this transformation, a theoretical guarantee for energy stability of the minimizing movement scheme is established. Our results suggest that bounding the solution through this transformation is the key to effectively modeling sharp interfaces with a separable neural network. The proposed method outperforms the state-of-the-art machine learning methods for phase separation problems and is an order of magnitude faster than the finite element method.
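For orientation, the ingredients named above can be written out in one common textbook form. The scaling of $\epsilon$, the quartic double-well potential, the phase values $\pm 1$, the time step $\tau$, the rank $R$, and the two-dimensional separable form are assumptions made for illustration and may differ from the paper's exact formulation.

```latex
% Ginzburg-Landau free energy (one common scaling, assumed here)
E[\phi] = \int_\Omega \left( \frac{\epsilon^2}{2} |\nabla \phi|^2
          + \frac{1}{4} \left( \phi^2 - 1 \right)^2 \right) \, dx

% L^2 gradient flow of E: the Allen-Cahn equation
\frac{\partial \phi}{\partial t} = -\frac{\delta E}{\delta \phi}
          = \epsilon^2 \Delta \phi - \left( \phi^3 - \phi \right)

% Minimizing movement scheme with time step \tau
\phi^{n+1} = \arg\min_{\phi} \left\{ \frac{1}{2\tau}
          \left\| \phi - \phi^{n} \right\|_{L^2(\Omega)}^2 + E[\phi] \right\}

% Separable (low-rank) network output, bounded by the tanh transformation
\phi_\theta(x, y) = \tanh\!\left( \sum_{r=1}^{R} f_r(x) \, g_r(y) \right) \in (-1, 1)
```

Setting the first variation of the minimization problem to zero recovers the implicit Euler discretization of the Allen-Cahn equation, which is why each step of the scheme decreases the energy $E$.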
CV · Oct 7, 2020
Explanation and Use of Uncertainty Quantified by Bayesian Neural Network Classifiers for Breast Histopathology Images
Ponkrshnan Thiagarajan, Pushkar Khairnar, Susanta Ghosh
Despite the promise of convolutional neural network (CNN) based classification models for histopathological images, it is infeasible to quantify their uncertainties. Moreover, CNNs may suffer from overfitting when the data is biased. We show that Bayesian-CNN can overcome these limitations by regularizing automatically and by quantifying the uncertainty. We have developed a novel technique to utilize the uncertainties provided by the Bayesian-CNN that significantly improves the performance on a large fraction of the test data (about 6% improvement in accuracy on 77% of test data). Further, we provide a novel explanation for the uncertainty by projecting the data into a low dimensional space through a nonlinear dimensionality reduction technique. This dimensionality reduction enables interpretation of the test data through visualization and reveals the structure of the data in a low dimensional feature space. We show that the Bayesian-CNN can perform much better than the state-of-the-art transfer learning CNN (TL-CNN) by reducing the false negatives and false positives by 11% and 7.7%, respectively, for the present data set. It achieves this performance with only 1.86 million parameters as compared to 134.33 million for TL-CNN. In addition, we modify the Bayesian-CNN by introducing a stochastic adaptive activation function. The modified Bayesian-CNN performs slightly better than Bayesian-CNN on all performance metrics and significantly reduces the number of false negatives and false positives (3% reduction for both). We also show that these results are statistically significant by performing McNemar's statistical significance test. This work shows the advantages of Bayesian-CNN over the state-of-the-art and explains and utilizes the uncertainties for histopathological images. It should find applications in various medical image classification tasks.
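One common way to obtain and use a per-image uncertainty from a Bayesian classifier is to average softmax outputs over repeated stochastic forward passes and threshold the predictive entropy. The sketch below illustrates that generic idea only; it is not the authors' technique, and the function names, the entropy threshold, and the hand-written probabilities are all hypothetical.

```python
import numpy as np

def predictive_summary(mc_probs):
    """Summarize Monte Carlo softmax outputs.

    mc_probs has shape (n_samples, n_inputs, n_classes): one softmax output
    per stochastic forward pass, per input.
    """
    mc_probs = np.asarray(mc_probs, dtype=float)
    mean_probs = mc_probs.mean(axis=0)   # predictive distribution per input
    labels = mean_probs.argmax(axis=1)   # predicted class per input
    # Entropy of the predictive distribution; small constant avoids log(0).
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    return labels, entropy

def split_by_uncertainty(entropy, threshold):
    """Boolean mask marking inputs whose predictive entropy is at or below threshold."""
    return entropy <= threshold

# Three stochastic passes over two inputs and two classes (illustrative values).
mc_probs = np.array([
    [[0.98, 0.02], [0.60, 0.40]],
    [[0.96, 0.04], [0.40, 0.60]],
    [[0.97, 0.03], [0.55, 0.45]],
])
labels, entropy = predictive_summary(mc_probs)
confident = split_by_uncertainty(entropy, threshold=0.3)
```

Here the first input receives consistent predictions and low entropy, while the second flips between classes across passes and is flagged as uncertain; in practice the threshold would be tuned on held-out data.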