LGAug 1, 2022
XOOD: Extreme Value Based Out-Of-Distribution Detection For Image ClassificationFrej Berglind, Haron Temam, Supratik Mukhopadhyay et al.
Detecting out-of-distribution (OOD) data at inference time is crucial for many applications of machine learning. We present XOOD: a novel extreme value-based OOD detection framework for image classification that consists of two algorithms. The first, XOOD-M, is completely unsupervised, while the second XOOD-L is self-supervised. Both algorithms rely on the signals captured by the extreme values of the data in the activation layers of the neural network in order to distinguish between in-distribution and OOD instances. We show experimentally that both XOOD-M and XOOD-L outperform state-of-the-art OOD detection methods on many benchmark data sets in both efficiency and accuracy, reducing false-positive rate (FPR95) by 50%, while improving the inferencing time by an order of magnitude.
LGFeb 5Code
Benchmarking Artificial Intelligence Models for Daily Coastal Hypoxia ForecastingMagesh Rajasekaran, Md Saiful Sajol, Chris Alvin et al.
Coastal hypoxia, especially in the northern part of Gulf of Mexico, presents a persistent ecological and economic concern. Seasonal models offer coarse forecasts that miss the fine-scale variability needed for daily, responsive ecosystem management. We present study that compares four deep learning architectures for daily hypoxia classification: Bidirectional Long Short-Term Memory (BiLSTM), Medformer (Medical Transformer), Spatio-Temporal Transformer (ST-Transformer), and Temporal Convolutional Network (TCN). We trained our models with twelve years of daily hindcast data from 2009-2020 Our training data consists of 2009-2020 hindcast data from a coupled hydrodynamic-biogeochemical model. Similarly, we use hindcast data from 2020 through 2024 as a test data. We constructed classification models incorporating water column stratification, sediment oxygen consumption, and temperature-dependent decomposition rates. We evaluated each architectures using the same data preprocessing, input/output formulation, and validation protocols. Each model achieved high classification accuracy and strong discriminative ability with ST-Transformer achieving the highest performance across all metrics and tests periods (AUC-ROC: 0.982-0.992). We also employed McNemar's method to identify statistically significant differences in model predictions. Our contribution is a reproducible framework for operational real-time hypoxia prediction that can support broader efforts in the environmental and ocean modeling systems community and in ecosystem resilience. The source code is available https://github.com/rmagesh148/hypoxia-ai/
LGMay 14, 2024Code
Risks and Opportunities of Open-Source Generative AIFrancisco Eiras, Aleksandar Petrov, Bertie Vidgen et al.
Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source generative AI. Using a three-stage framework for Gen AI development (near, mid and long-term), we analyze the risks and opportunities of open-source generative AI models with similar capabilities to the ones currently available (near to mid-term) and with greater capabilities (long-term). We argue that, overall, the benefits of open-source Gen AI outweigh its risks. As such, we encourage the open sourcing of models, training and evaluation data, and provide a set of recommendations and best practices for managing risks associated with open-source generative AI.
LGApr 25, 2024Code
Near to Mid-term Risks and Opportunities of Open-Source Generative AIFrancisco Eiras, Aleksandar Petrov, Bertie Vidgen et al.
In the next few years, applications of Generative AI are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about potential risks and resulted in calls for tighter regulation, in particular from some of the major tech companies who are leading in AI development. This regulation is likely to put at risk the budding field of open-source Generative AI. We argue for the responsible open sourcing of generative AI models in the near and medium term. To set the stage, we first introduce an AI openness taxonomy system and apply it to 40 current large language models. We then outline differential benefits and risks of open versus closed source AI and present potential risk mitigation, ranging from best practices to calls for technical and scientific contributions. We hope that this report will add a much needed missing voice to the current public discourse on near to mid-term AI safety and other societal impact.
LGOct 24, 2025
A Multimodal Human Protein Embeddings Database: DeepDrug Protein Embeddings Bank (DPEB)Md Saiful Islam Sajol, Magesh Rajasekaran, Hayden Gemeinhardt et al.
Computationally predicting protein-protein interactions (PPIs) is challenging due to the lack of integrated, multimodal protein representations. DPEB is a curated collection of 22,043 human proteins that integrates four embedding types: structural (AlphaFold2), transformer-based sequence (BioEmbeddings), contextual amino acid patterns (ESM-2: Evolutionary Scale Modeling), and sequence-based n-gram statistics (ProtVec]). AlphaFold2 protein structures are available through public databases (e.g., AlphaFold2 Protein Structure Database), but the internal neural network embeddings are not. DPEB addresses this gap by providing AlphaFold2-derived embeddings for computational modeling. Our benchmark evaluations show GraphSAGE with BioEmbedding achieved the highest PPI prediction performance (87.37% AUROC, 79.16% accuracy). The framework also achieved 77.42% accuracy for enzyme classification and 86.04% accuracy for protein family classification. DPEB supports multiple graph neural network methods for PPI prediction, enabling applications in systems biology, drug target identification, pathway analysis, and disease mechanism studies.
IVJan 23, 2020
A One-Shot Learning Framework for Assessment of Fibrillar Collagen from Second Harmonic Generation Images of an Infarcted MyocardiumQun Liu, Supratik Mukhopadhyay, Maria Ximena Bastidas Rodriguez et al.
Myocardial infarction (MI) is a scientific term that refers to heart attack. In this study, we infer highly relevant second harmonic generation (SHG) cues from collagen fibers exhibiting highly non-centrosymmetric assembly together with two-photon excited cellular autofluorescence in infarcted mouse heart to quantitatively probe fibrosis, especially targeted at an early stage after MI. We present a robust one-shot machine learning algorithm that enables determination of 2D assembly of collagen with high spatial resolution along with its structural arrangement in heart tissues post-MI with spectral specificity and sensitivity. Detection, evaluation, and precise quantification of fibrosis extent at early stage would guide one to develop treatment therapies that may prevent further progression and determine heart transplant needs for patient survival.
AIJan 7, 2020
Context-Aware Design of Cyber-Physical Human Systems (CPHS)Supratik Mukhopadhyay, Qun Liu, Edward Collier et al.
Recently, it has been widely accepted by the research community that interactions between humans and cyber-physical infrastructures have played a significant role in determining the performance of the latter. The existing paradigm for designing cyber-physical systems for optimal performance focuses on developing models based on historical data. The impacts of context factors driving human system interaction are challenging and are difficult to capture and replicate in existing design models. As a result, many existing models do not or only partially address those context factors of a new design owing to the lack of capabilities to capture the context factors. This limitation in many existing models often causes performance gaps between predicted and measured results. We envision a new design environment, a cyber-physical human system (CPHS) where decision-making processes for physical infrastructures under design are intelligently connected to distributed resources over cyberinfrastructure such as experiments on design features and empirical evidence from operations of existing instances. The framework combines existing design models with context-aware design-specific data involving human-infrastructure interactions in new designs, using a machine learning approach to create augmented design models with improved predictive powers.
CVNov 15, 2019
DeepSat V2: Feature Augmented Convolutional Neural Nets for Satellite Image ClassificationQun Liu, Saikat Basu, Sangram Ganguly et al.
Satellite image classification is a challenging problem that lies at the crossroads of remote sensing, computer vision, and machine learning. Due to the high variability inherent in satellite data, most of the current object classification approaches are not suitable for handling satellite datasets. The progress of satellite image analytics has also been inhibited by the lack of a single labeled high-resolution dataset with multiple class labels. In a preliminary version of this work, we introduced two new high resolution satellite imagery datasets (SAT-4 and SAT-6) and proposed DeepSat framework for classification based on "handcrafted" features and a deep belief network (DBN). The present paper is an extended version, we present an end-to-end framework leveraging an improved architecture that augments a convolutional neural network (CNN) with handcrafted features (instead of using DBN-based architecture) for classification. Our framework, having access to fused spatial information obtained from handcrafted features as well as CNN feature maps, have achieved accuracies of 99.90% and 99.84% respectively, on SAT-4 and SAT-6, surpassing all the other state-of-the-art results. A statistical analysis based on Distribution Separability Criterion substantiates the robustness of our approach in learning better representations for satellite imagery.
CVAug 11, 2019
PCGAN-CHAR: Progressively Trained Classifier Generative Adversarial Networks for Classification of Noisy Handwritten Bangla CharactersQun Liu, Edward Collier, Supratik Mukhopadhyay
Due to the sparsity of features, noise has proven to be a great inhibitor in the classification of handwritten characters. To combat this, most techniques perform denoising of the data before classification. In this paper, we consolidate the approach by training an all-in-one model that is able to classify even noisy characters. For classification, we progressively train a classifier generative adversarial network on the characters from low to high resolution. We show that by learning the features at each resolution independently a trained model is able to accurately classify characters even in the presence of noise. We experimentally demonstrate the effectiveness of our approach by classifying noisy versions of MNIST, handwritten Bangla Numeral, and Basic Character datasets.
LGJun 13, 2019
Improving Prediction Accuracy in Building Performance Models Using Generative Adversarial Networks (GANs)Chanachok Chokwitthaya, Edward Collier, Yimin Zhu et al.
Building performance discrepancies between building design and operation are one of the causes that lead many new designs fail to achieve their goals and objectives. One of main factors contributing to the discrepancy is occupant behaviors. Occupants responding to a new design are influenced by several factors. Existing building performance models (BPMs) ignore or partially address those factors (called contextual factors) while developing BPMs. To potentially reduce the discrepancies and improve the prediction accuracy of BPMs, this paper proposes a computational framework for learning mixture models by using Generative Adversarial Networks (GANs) that appropriately combining existing BPMs with knowledge on occupant behaviors to contextual factors in new designs. Immersive virtual environments (IVEs) experiments are used to acquire data on such behaviors. Performance targets are used to guide appropriate combination of existing BPMs with knowledge on occupant behaviors. The resulting model obtained is called an augmented BPM. Two different experiments related to occupant lighting behaviors are shown as case study. The results reveal that augmented BPMs significantly outperformed existing BPMs with respect to achieving specified performance targets. The case study confirms the potential of the computational framework for improving prediction accuracy of BPMs during design.
LGMar 27, 2019
Improving Route Choice Models by Incorporating Contextual Factors via Knowledge DistillationQun Liu, Supratik Mukhopadhyay, Yimin Zhu et al.
Route Choice Models predict the route choices of travelers traversing an urban area. Most of the route choice models link route characteristics of alternative routes to those chosen by the drivers. The models play an important role in prediction of traffic levels on different routes and thus assist in development of efficient traffic management strategies that result in minimizing traffic delay and maximizing effective utilization of transport system. High fidelity route choice models are required to predict traffic levels with higher accuracy. Existing route choice models do not take into account dynamic contextual conditions such as the occurrence of an accident, the socio-cultural and economic background of drivers, other human behaviors, the dynamic personal risk level, etc. As a result, they can only make predictions at an aggregate level and for a fixed set of contextual factors. For higher fidelity, it is highly desirable to use a model that captures significance of subjective or contextual factors in route choice. This paper presents a novel approach for developing high-fidelity route choice models with increased predictive power by augmenting existing aggregate level baseline models with information on drivers' responses to contextual factors obtained from Stated Choice Experiments carried out in an Immersive Virtual Environment through the use of knowledge distillation.
CVFeb 12, 2019
Progressively Growing Generative Adversarial Networks for High Resolution Semantic Segmentation of Satellite ImagesEdward Collier, Kate Duffy, Sangram Ganguly et al.
Machine learning has proven to be useful in classification and segmentation of images. In this paper, we evaluate a training methodology for pixel-wise segmentation on high resolution satellite images using progressive growing of generative adversarial networks. We apply our model to segmenting building rooftops and compare these results to conventional methods for rooftop segmentation. We present our findings using the SpaceNet version 2 dataset. Progressive GAN training achieved a test accuracy of 93% compared to 89% for traditional GAN training.
CVJun 21, 2018
Pixel-level Reconstruction and Classification for Noisy Handwritten Bangla CharactersManohar Karki, Qun Liu, Robert DiBiano et al.
Classification techniques for images of handwritten characters are susceptible to noise. Quadtrees can be an efficient representation for learning from sparse features. In this paper, we improve the effectiveness of probabilistic quadtrees by using a pixel level classifier to extract the character pixels and remove noise from handwritten character images. The pixel level denoiser (a deep belief network) uses the map responses obtained from a pretrained CNN as features for reconstructing the characters eliminating noise. We experimentally demonstrate the effectiveness of our approach by reconstructing and classifying a noisy version of handwritten Bangla Numeral and Basic Character datasets.
CVMay 2, 2018
Unsupervised Learning using Pretrained CNN and Associative Memory BankQun Liu, Supratik Mukhopadhyay
Deep Convolutional features extracted from a comprehensive labeled dataset, contain substantial representations which could be effectively used in a new domain. Despite the fact that generic features achieved good results in many visual tasks, fine-tuning is required for pretrained deep CNN models to be more effective and provide state-of-the-art performance. Fine tuning using the backpropagation algorithm in a supervised setting, is a time and resource consuming process. In this paper, we present a new architecture and an approach for unsupervised object recognition that addresses the above mentioned problem with fine tuning associated with pretrained CNN-based supervised deep learning approaches while allowing automated feature extraction. Unlike existing works, our approach is applicable to general object recognition tasks. It uses a pretrained (on a related domain) CNN model for automated feature extraction pipelined with a Hopfield network based associative memory bank for storing patterns for classification purposes. The use of associative memory bank in our framework allows eliminating backpropagation while providing competitive performance on an unseen dataset.
LGApr 20, 2018
CactusNets: Layer Applicability as a Metric for Transfer LearningEdward Collier, Robert DiBiano, Supratik Mukhopadhyay
Deep neural networks trained over large datasets learn features that are both generic to the whole dataset, and specific to individual classes in the dataset. Learned features tend towards generic in the lower layers and specific in the higher layers of a network. Methods like fine-tuning are made possible because of the ability for one filter to apply to multiple target classes. Much like the human brain this behavior, can also be used to cluster and separate classes. However, to the best of our knowledge there is no metric for how applicable learned features are to specific classes. In this paper we propose a definition and metric for measuring the applicability of learned features to individual classes, and use this applicability metric to estimate input applicability and produce a new method of unsupervised learning we call the CactusNet.
CVDec 6, 2016
Core Sampling Framework for Pixel ClassificationManohar Karki, Robert DiBiano, Saikat Basu et al.
The intermediate map responses of a Convolutional Neural Network (CNN) contain information about an image that can be used to extract contextual knowledge about it. In this paper, we present a core sampling framework that is able to use these activation maps from several layers as features to another neural network using transfer learning to provide an understanding of an input image. Our framework creates a representation that combines features from the test data and the contextual knowledge gained from the responses of a pretrained network, processes it and feeds it to a separate Deep Belief Network. We use this representation to extract more information from an image at the pixel level, hence gaining understanding of the whole image. We experimentally demonstrate the usefulness of our framework using a pretrained VGG-16 model to perform segmentation on the BAERI dataset of Synthetic Aperture Radar(SAR) imagery and the CAMVID dataset.
CVMay 9, 2016
A Theoretical Analysis of Deep Neural Networks for Texture ClassificationSaikat Basu, Manohar Karki, Robert DiBiano et al.
We investigate the use of Deep Neural Networks for the classification of image datasets where texture features are important for generating class-conditional discriminative representations. To this end, we first derive the size of the feature space for some standard textural features extracted from the input dataset and then use the theory of Vapnik-Chervonenkis dimension to show that hand-crafted feature extraction creates low-dimensional representations which help in reducing the overall excess error rate. As a corollary to this analysis, we derive for the first time upper bounds on the VC dimension of Convolutional Neural Network as well as Dropout and Dropconnect networks and the relation between excess error rate of Dropout and Dropconnect networks. The concept of intrinsic dimension is used to validate the intuition that texture-based datasets are inherently higher dimensional as compared to handwritten digits or other object recognition datasets and hence more difficult to be shattered by neural networks. We then derive the mean distance from the centroid to the nearest and farthest sampling points in an n-dimensional manifold and show that the Relative Contrast of the sample data vanishes as dimensionality of the underlying vector space tends to infinity.
AIOct 29, 2015
Automatic Synthesis of Geometry Problems for an Intelligent Tutoring SystemChris Alvin, Sumit Gulwani, Rupak Majumdar et al.
This paper presents an intelligent tutoring system, GeoTutor, for Euclidean Geometry that is automatically able to synthesize proof problems and their respective solutions given a geometric figure together with a set of properties true of it. GeoTutor can provide personalized practice problems that address student deficiencies in the subject matter.
CVSep 11, 2015
DeepSat - A Learning framework for Satellite ImagerySaikat Basu, Sangram Ganguly, Supratik Mukhopadhyay et al.
Satellite image classification is a challenging problem that lies at the crossroads of remote sensing, computer vision, and machine learning. Due to the high variability inherent in satellite data, most of the current object classification approaches are not suitable for handling satellite datasets. The progress of satellite image analytics has also been inhibited by the lack of a single labeled high-resolution dataset with multiple class labels. The contributions of this paper are twofold - (1) first, we present two new satellite datasets called SAT-4 and SAT-6, and (2) then, we propose a classification framework that extracts features from an input image, normalizes them and feeds the normalized feature vectors to a Deep Belief Network for classification. On the SAT-4 dataset, our best network produces a classification accuracy of 97.95% and outperforms three state-of-the-art object recognition algorithms, namely - Deep Belief Networks, Convolutional Neural Networks and Stacked Denoising Autoencoders by ~11%. On SAT-6, it produces a classification accuracy of 93.9% and outperforms the other algorithms by ~15%. Comparative studies with a Random Forest classifier show the advantage of an unsupervised learning approach over traditional supervised learning techniques. A statistical analysis based on Distribution Separability Criterion and Intrinsic Dimensionality Estimation substantiates the effectiveness of our approach in learning better representations for satellite imagery.
CVSep 11, 2015
Learning Sparse Feature Representations using Probabilistic Quadtrees and Deep Belief NetsSaikat Basu, Manohar Karki, Sangram Ganguly et al.
Learning sparse feature representations is a useful instrument for solving an unsupervised learning problem. In this paper, we present three labeled handwritten digit datasets, collectively called n-MNIST. Then, we propose a novel framework for the classification of handwritten digits that learns sparse representations using probabilistic quadtrees and Deep Belief Nets. On the MNIST and n-MNIST datasets, our framework shows promising results and significantly outperforms traditional Deep Belief Networks.