Rohitash Chandra

LG
h-index: 25
67 papers
2,198 citations
Novelty: 25%
AI Score: 50

67 Papers

LG · Apr 6, 2023
A review of ensemble learning and data augmentation models for class imbalanced problems: combination, implementation and evaluation

Azal Ahmad Khan, Omkar Chaudhari, Rohitash Chandra

Class imbalance (CI) in classification problems arises when the number of observations belonging to one class is lower than the other. Ensemble learning combines multiple models to obtain a robust model and has been prominently used with data augmentation methods to address class imbalance problems. In the last decade, a number of strategies have been added to enhance ensemble learning and data augmentation methods, along with new methods such as generative adversarial networks (GANs). A combination of these has been applied in many studies, and the evaluation of different combinations would enable a better understanding and guidance for different application domains. In this paper, we present a computational study to evaluate data augmentation and ensemble learning methods used to address prominent benchmark CI problems. We present a general framework that evaluates 9 data augmentation and 9 ensemble learning methods for CI problems. Our objective is to identify the most effective combination for improving classification performance on imbalanced datasets. The results indicate that combinations of data augmentation methods with ensemble learning can significantly improve classification performance on imbalanced datasets. We find that traditional data augmentation methods such as the synthetic minority oversampling technique (SMOTE) and random oversampling (ROS) are not only better in performance for selected CI problems, but also computationally less expensive than GANs. Our study is vital for the development of novel models for handling imbalanced datasets.
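To make the idea of random oversampling (ROS) in the abstract above concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function name `random_oversample`, the toy data and the choice to balance every class up to the size of the largest class are assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class (illustrative ROS sketch)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        rows = [x for x, y in zip(features, labels) if y == cls]
        # Sample with replacement from the existing rows of this class.
        for _ in range(target - n):
            out_x.append(rng.choice(rows))
            out_y.append(cls)
    return out_x, out_y


# Toy imbalanced dataset: four rows of class 0, one row of class 1.
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
balanced_counts = Counter(y_bal)
```

In practice the balanced data would be passed to an ensemble learner. Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.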

AI · Jun 24, 2023 · Code
A clustering and graph deep learning-based framework for COVID-19 drug repurposing

Chaarvi Bansal, Rohitash Chandra, Vinti Agarwal et al.

Drug repurposing (or repositioning) is the process of finding new therapeutic uses for drugs already approved by drug regulatory authorities (e.g., the Food and Drug Administration (FDA) and Therapeutic Goods Administration (TGA)) for other diseases. This involves analyzing the interactions between different biological entities, such as drug targets (genes/proteins and biological pathways) and drug properties, to discover novel drug-target or drug-disease relations. Artificial intelligence methods such as machine learning and deep learning have successfully analyzed complex heterogeneous data in the biomedical domain and have also been used for drug repurposing. This study presents a novel unsupervised machine learning framework that utilizes a graph-based autoencoder for multi-feature type clustering on heterogeneous drug data. The dataset consists of 438 drugs, of which 224 are under clinical trials for COVID-19 (category A). The rest are systematically filtered to ensure the safety and efficacy of the treatment (category B). The framework solely relies on reported drug data, including its pharmacological properties, chemical/physical properties, interaction with the host, and efficacy in different publicly available COVID-19 assays. Our machine-learning framework reveals three clusters of interest and provides recommendations featuring the top 15 drugs for COVID-19 drug repurposing, which were shortlisted based on the predicted clusters that were dominated by category A drugs. The anti-COVID efficacy of the drugs should be verified by experimental studies. Our framework can be extended to support other datasets and drug repurposing studies, given open-source code and data availability.

LG · Jan 26, 2023
Reef-insight: A framework for reef habitat mapping with clustering methods via remote sensing

Saharsh Barve, Jody M. Webster, Rohitash Chandra

Environmental damage has been of much concern, particularly in coastal areas and the oceans, given climate change and the drastic effects of pollution and extreme climate events. Our present-day analytical capabilities, along with advancements in information acquisition techniques such as remote sensing, can be utilised for the management and study of coral reef ecosystems. In this paper, we present Reef-Insight, an unsupervised machine learning framework that features advanced clustering methods and remote sensing for reef habitat mapping. Our framework compares different clustering methods for reef habitat mapping using remote sensing data. We evaluate four major clustering approaches (k-means, hierarchical clustering, Gaussian mixture model, and density-based clustering) based on qualitative and visual assessments. We utilise remote sensing data featuring the One Tree Island reef in Australia's Southern Great Barrier Reef. Our results indicate that clustering methods using remote sensing data can identify benthic and geomorphic clusters in reefs well when compared with other studies. They also indicate that Reef-Insight can generate detailed reef habitat maps outlining distinct reef habitats and has the potential to enable further insights for reef restoration projects.

NE · Aug 4, 2022
Evolutionary bagging for ensemble learning

Giang Ngo, Rodney Beard, Rohitash Chandra

Ensemble learning has gained success in machine learning with major advantages over other learning methods. Bagging is a prominent ensemble learning method that creates subgroups of data, known as bags, that are trained by individual machine learning methods such as decision trees. Random forest is a prominent example of bagging with additional features in the learning process. Evolutionary algorithms have been prominent for optimisation problems and have also been used for machine learning. Evolutionary algorithms are gradient-free methods that work with a population of candidate solutions that maintain diversity for creating new solutions. In conventional bagged ensemble learning, the bags are created once and their content, in terms of the training examples, is fixed over the learning process. In our paper, we propose evolutionary bagged ensemble learning, where we utilise evolutionary algorithms to evolve the content of the bags in order to iteratively enhance the ensemble by providing diversity in the bags. The results show that our evolutionary ensemble bagging method outperforms conventional ensemble methods (bagging and random forests) for several benchmark datasets under certain constraints. We find that evolutionary bagging can inherently sustain a diverse set of bags without reduction in performance accuracy.

ML · Apr 2, 2023
Bayesian neural networks via MCMC: a Python-based tutorial

Rohitash Chandra, Joshua Simmons

Bayesian inference provides a methodology for parameter estimation and uncertainty quantification in machine learning and deep learning methods. Variational inference and Markov Chain Monte-Carlo (MCMC) sampling methods are used to implement Bayesian inference. In the past three decades, MCMC sampling methods have faced some challenges in being adapted to larger models (such as in deep learning) and big data problems. Advanced proposal distributions that incorporate gradients, such as a Langevin proposal distribution, provide a means to address some of the limitations of MCMC sampling for Bayesian neural networks. Furthermore, MCMC methods have typically been constrained to statisticians and are currently not well known among deep learning researchers. We present a tutorial for MCMC methods that covers simple Bayesian linear and logistic models, and Bayesian neural networks. The aim of this tutorial is to bridge the gap between theory and implementation via coding, given a general sparsity of libraries and tutorials to this end. This tutorial provides code in Python with data and instructions that enable their use and extension. We provide results for some benchmark problems showing the strengths and weaknesses of implementing the respective Bayesian models via MCMC. We highlight the challenges in sampling multi-modal posterior distributions for the case of Bayesian neural networks and the need for further improvement of convergence diagnosis methods.
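As a rough illustration of the kind of sampler such a tutorial covers, the sketch below runs random-walk Metropolis on a one-parameter Bayesian linear model. It is not the tutorial's code. The model (a single slope `w`, known noise standard deviation, Gaussian prior), the synthetic data, the step size and the burn-in length are all assumptions chosen to keep the example short. A Langevin proposal, as mentioned in the abstract, would additionally shift each proposal along the gradient of the log-posterior.

```python
import math
import random


def log_posterior(w, xs, ys, noise_sd=1.0, prior_sd=10.0):
    """Unnormalised log-posterior of slope w for y = w * x + Gaussian noise."""
    log_prior = -0.5 * (w / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - w * x) / noise_sd) ** 2 for x, y in zip(xs, ys))
    return log_prior + log_lik


def random_walk_metropolis(xs, ys, n_samples=5000, step=0.1, seed=0):
    """Draw samples of w using a symmetric Gaussian random-walk proposal."""
    rng = random.Random(seed)
    w = 0.0
    current = log_posterior(w, xs, ys)
    samples = []
    for _ in range(n_samples):
        proposal = w + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, xs, ys)
        # Accept with probability min(1, exp(candidate - current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            w, current = proposal, candidate
        samples.append(w)
    return samples


# Synthetic data generated with a true slope of 2 (an assumption for the demo).
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + data_rng.gauss(0.0, 1.0) for x in xs]

samples = random_walk_metropolis(xs, ys)
kept = samples[1000:]  # discard burn-in
posterior_mean = sum(kept) / len(kept)
```

The spread of `kept` gives the uncertainty estimate that motivates the Bayesian treatment. For neural networks, the same accept/reject loop applies to the full weight vector.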

CL · May 23, 2022
Artificial intelligence for topic modelling in Hindu philosophy: mapping themes between the Upanishads and the Bhagavad Gita

Rohitash Chandra, Mukul Ranjan

A distinct feature of Hindu religious and philosophical texts is that they come from a library of texts rather than a single source. The Upanishads are among the oldest philosophical texts in the world and form the foundation of Hindu philosophy. The Bhagavad Gita is a core text of Hindu philosophy and is known as a text that summarises the key philosophies of the Upanishads with a major focus on the philosophy of karma. These texts have been translated into many languages and there exist studies about prominent themes and topics; however, there has been little study of topic modelling using language models powered by deep learning. In this paper, we use advanced language models such as BERT to provide topic modelling of the key texts of the Upanishads and the Bhagavad Gita. We analyse the distinct and overlapping topics amongst the texts and visualise the link of selected texts of the Upanishads with the Bhagavad Gita. Our results show a very high similarity between the topics of these two texts with a mean cosine similarity of 73%. We find that out of the fourteen topics extracted from the Bhagavad Gita, nine have a cosine similarity of more than 70% with the topics of the Upanishads. We also found that topics generated by the BERT-based models show very high coherence compared to that of conventional models. Our best performing model gives a coherence score of 73% on the Bhagavad Gita and 69% on the Upanishads. The visualization of the low dimensional embeddings of these texts shows very clear overlap among their topics, adding another level of validation to our results.
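The cosine similarity reported above can be sketched as follows. This is a generic illustration rather than the paper's pipeline. The three-dimensional vectors stand in for topic embeddings that would in practice come from a BERT-based model, and a score of 0.73 on this scale corresponds to the 73% quoted in the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical topic embeddings (placeholders, not real model output).
gita_topic = [1.0, 0.0, 1.0]
upanishad_topic = [1.0, 1.0, 0.0]

similarity = cosine_similarity(gita_topic, upanishad_topic)
self_similarity = cosine_similarity(gita_topic, gita_topic)
```

Because the measure depends only on direction, two topic vectors of very different magnitude can still score close to 1 when they point the same way.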

GN · Apr 21, 2023
Multi-Modal Deep Learning for Credit Rating Prediction Using Text and Numerical Data Streams

Mahsa Tavakoli, Rohitash Chandra, Fengrui Tian et al.

Knowing which factors are significant in credit rating assignment leads to better decision-making. However, the focus of the literature thus far has been mostly on structured data, and fewer studies have addressed unstructured or multi-modal datasets. In this paper, we present an analysis of the most effective architectures for the fusion of deep learning models for the prediction of company credit rating classes, by using structured and unstructured datasets of different types. In these models, we tested different combinations of fusion strategies with different deep learning models, including CNN, LSTM, GRU, and BERT. We studied data fusion strategies in terms of level (including early and intermediate fusion) and techniques (including concatenation and cross-attention). Our results show that a CNN-based multi-modal model with two fusion strategies outperformed other multi-modal techniques. In addition, by comparing simple architectures with more complex ones, we found that more sophisticated deep learning models do not necessarily produce the highest performance; however, if attention-based models are producing the best results, cross-attention is necessary as a fusion strategy. Finally, our comparison of rating agencies on short-, medium-, and long-term performance shows that Moody's credit ratings outperform those of other agencies like Standard & Poor's and Fitch Ratings.

CL · Feb 28, 2023
An evaluation of Google Translate for Sanskrit to English translation via sentiment and semantic analysis

Akshat Shukla, Chaarvi Bansal, Sushrut Badhe et al.

Google Translate has been prominent for language translation; however, limited work has been done in evaluating the quality of translation when compared to human experts. Sanskrit is one of the oldest written languages in the world. In 2022, the Sanskrit language was added to the Google Translate engine. Sanskrit is known as the mother of languages such as Hindi and an ancient source of the Indo-European group of languages. Sanskrit is the original language for sacred Hindu texts such as the Bhagavad Gita. In this study, we present a framework that evaluates Google Translate for Sanskrit using the Bhagavad Gita. We first publish a translation of the Bhagavad Gita from Sanskrit using Google Translate. Our framework then compares the Google Translate version of the Bhagavad Gita with expert translations using sentiment and semantic analysis via BERT-based language models. Our results indicate that in terms of sentiment and semantic analysis, there is a low level of similarity in selected verses of Google Translate when compared to expert translations. In the qualitative evaluation, we find that Google Translate is unsuitable for translation of certain Sanskrit words and phrases due to their poetic nature, contextual significance, metaphor and imagery. The mistranslations are not surprising since the Bhagavad Gita is known as a difficult text not only to translate, but also to interpret since it relies on contextual, philosophical and historical information. Our framework lays the foundation for automatic evaluation of other languages by Google Translate.

LG · Feb 28, 2023
Deep learning for COVID-19 topic modelling via Twitter: Alpha, Delta and Omicron

Janhavi Lande, Arti Pillay, Rohitash Chandra

Topic modelling with innovative deep learning methods has gained interest for a wide range of applications that includes COVID-19. Topic modelling can provide psychological, social and cultural insights for understanding human behaviour in extreme events such as the COVID-19 pandemic. In this paper, we use prominent deep learning-based language models for COVID-19 topic modelling taking into account data from emergence (Alpha) to the Omicron variant. We apply topic modelling to review public behaviour across the first, second and third waves based on a Twitter dataset from India. Our results show that the topics extracted for the subsequent waves had certain overlapping themes such as governance, vaccination, and pandemic management, while novel issues arose in the political, social and economic situation during the COVID-19 pandemic. We also found a strong qualitative correlation of the major topics with the news media prevalent at the respective time period. Hence, our framework has the potential to capture major issues arising during different phases of the COVID-19 pandemic and can be extended to other countries and regions.

SI · Jun 23, 2023
An analysis of vaccine-related sentiments from development to deployment of COVID-19 vaccines

Rohitash Chandra, Jayesh Sonawane, Janhavi Lande et al.

Anti-vaccine sentiments have been well-known and reported throughout the history of viral outbreaks and vaccination programmes. The COVID-19 pandemic brought fear and uncertainty about vaccines, which have been widely expressed on social media platforms such as Twitter. We analyse Twitter sentiments from the beginning of the COVID-19 pandemic and study the public behaviour during the planning, development and deployment of vaccines expressed in tweets worldwide using a sentiment analysis framework via deep learning models. In this way, we provide visualisation and analysis of anti-vaccine sentiments over the course of the COVID-19 pandemic. Our results show a link between the number of tweets, the number of cases, and the change in sentiment polarity scores during major waves of COVID-19 cases. We also found that the first half of the pandemic had drastic changes in the sentiment polarity scores that later stabilised, which implies that the vaccine rollout had an impact on the nature of discussions on social media.

OT · Aug 1, 2022
Unsupervised machine learning framework for discriminating major variants of concern during COVID-19

Rohitash Chandra, Chaarvi Bansal, Mingyue Kang et al.

Due to the high mutation rate of the virus, the COVID-19 pandemic evolved rapidly. Certain variants of the virus, such as Delta and Omicron, emerged with altered viral properties leading to severe transmission and death rates. These variants burdened the medical systems worldwide with a major impact on travel, productivity, and the world economy. Unsupervised machine learning methods have the ability to compress, characterize, and visualize unlabelled data. This paper presents a framework that utilizes unsupervised machine learning methods to discriminate and visualize the associations between major COVID-19 variants based on their genome sequences. These methods comprise a combination of selected dimensionality reduction and clustering techniques. The framework processes the RNA sequences by performing a k-mer analysis on the data and further visualizes and compares the results using selected dimensionality reduction methods that include principal component analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE), and uniform manifold approximation and projection (UMAP). Our framework also employs agglomerative hierarchical clustering to visualize the mutational differences among major variants of concern and country-wise mutational differences for selected variants (Delta and Omicron) using dendrograms. We find that the proposed framework can effectively distinguish between the major variants and has the potential to identify emerging variants in the future.
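The k-mer analysis step mentioned above can be illustrated with a short sketch: a sequence is turned into counts of all overlapping substrings of length k, and those counts become the feature vector passed to dimensionality reduction or clustering. The function name, the choice of k = 3 and the toy sequence are assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def kmer_counts(sequence, k=3):
    """Count every overlapping substring of length k in the sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Toy sequence, far shorter than a real genome.
counts = kmer_counts("ATGATGC", k=3)

# Fixed-order feature vector over the k-mers seen in this sequence.
vocabulary = sorted(counts)
features = [counts[kmer] for kmer in vocabulary]
```

When several genomes are compared, the vocabulary would be shared across all sequences so that each position in the feature vector refers to the same k-mer.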

LG · Jan 25, 2023
Recursive deep learning framework for forecasting the decadal world economic outlook

Tianyi Wang, Rodney Beard, John Hawkins et al.

The gross domestic product (GDP) is the most widely used indicator in macroeconomics and the main tool for measuring a country's economic output. Due to the diversity and complexity of the world economy, a wide range of models have been used, but there are challenges in making decadal GDP forecasts given unexpected changes such as the emergence of catastrophic world events including pandemics and wars. Deep learning models are well suited for modelling temporal sequences and time series forecasting. In this paper, we develop a deep learning framework to forecast the GDP growth rate of the world economy over a decade. We use the Penn World Table as the data source featuring 13 countries prior to the COVID-19 pandemic, such as Australia, China, India, and the United States. We present a recursive deep learning framework to predict the GDP growth rate in the next ten years. We test prominent deep learning models and compare their results with traditional econometric models for selected developed and developing countries. Our decadal forecasts reveal that most of the developed countries would experience economic growth slowdown, stagnation and even recession within five years (2020-2024). Furthermore, our model forecasts show that only China, France, and India would experience stable GDP growth.
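The recursive strategy named in the abstract can be sketched generically: a one-step-ahead model is applied repeatedly, and each prediction is appended to the input window for the next step. The sketch below is an assumption-laden illustration, not the paper's framework. In particular, `moving_average_model` is a trivial stand-in for a trained deep learning model, and the growth figures are made up.

```python
def recursive_forecast(history, one_step_model, window, horizon):
    """Forecast `horizon` steps ahead by feeding predictions back as inputs."""
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        next_value = one_step_model(values[-window:])
        forecasts.append(next_value)
        values.append(next_value)  # the prediction becomes part of the input
    return forecasts


def moving_average_model(recent):
    """Placeholder one-step model: mean of the most recent window."""
    return sum(recent) / len(recent)


# Hypothetical annual growth rates in percent.
growth = [3.0, 2.0, 4.0]
forecasts = recursive_forecast(growth, moving_average_model, window=2, horizon=3)
```

Because every step consumes earlier predictions, errors can compound over the horizon, which is one reason decadal forecasts are harder than one-year-ahead forecasts.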

NE · Apr 24, 2022
MAP-Elites based Hyper-Heuristic for the Resource Constrained Project Scheduling Problem

Shelvin Chand, Kousik Rajesh, Rohitash Chandra

The resource constrained project scheduling problem (RCPSP) is an NP-Hard combinatorial optimization problem. The objective of RCPSP is to schedule a set of activities without violating any activity precedence or resource constraints. In recent years, researchers have moved away from complex solution methodologies, such as metaheuristics and exact mathematical approaches, towards simpler, more intuitive solutions like priority rules. This often involves using a genetic programming based hyper-heuristic (GPHH) to discover new priority rules which can be applied to new unseen cases. A common problem affecting GPHH is diversity in evolution which often leads to poor quality output. In this paper, we present a MAP-Elites based hyper-heuristic (MEHH) for the automated discovery of efficient priority rules for RCPSP. MAP-Elites uses a quality diversity based approach which explicitly maintains an archive of diverse solutions characterised along multiple feature dimensions. In order to demonstrate the benefits of our proposed hyper-heuristic, we compare the overall performance against a traditional GPHH and priority rules proposed by human experts. Our results indicate strong improvements in both diversity and performance. In particular, we see major improvements for larger instances which have been under-studied in the existing literature.

CV · May 19
Remote sensing data imputation using deep learning for multispectral imagery

Shuang Liu, Fiona Johnson, Rohitash Chandra

Remote sensing techniques have been increasingly utilised in aquatic applications in recent years. A common challenge in using optical satellite data is the presence of missing observations due to cloud cover. These data gaps can lead to missed detection of critical events, such as algal blooms, in lakes of high interest to water authorities. As a result, enhancing the completeness of optical satellite datasets is crucial for improving the monitoring and prediction of algal blooms. In this study, we compared a traditional data imputation method (i.e., linear interpolation) with deep learning models for reconstructing missing spectral bands across four lakes with historical records of algal blooms. The deep learning models adopted include CNN-based architectures (i.e., CNN, Inception Resnet, and Autoencoder) and CNN-LSTM-based architectures (i.e., CNN-LSTM, Resnet-LSTM, and Autoencoder-LSTM). Our results demonstrated that deep learning models substantially outperformed the baseline linear interpolation method in imputing spectral band values within artificially masked regions. Among these models, CNN delivered the best performance across most lakes. Furthermore, we evaluated the performance of algal bloom indices (i.e., Green/Red and NDCI) derived from the imputed imagery by comparing them with the observed data. Our results demonstrate that deep learning models are effective for imputing missing data in PlanetScope SuperDove imagery, enabling more reliable applications in water monitoring.
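The baseline named in the abstract, linear interpolation, can be sketched for a single pixel and band as below. This is an illustrative assumption about how such a baseline might be set up: observations are treated as a time series, cloud-masked dates are marked with `None`, and gaps at either end are left unfilled. The paper's actual masking and interpolation scheme is not specified in the abstract, and the values here are invented.

```python
def linear_interpolate(series):
    """Fill None gaps that lie between two known values by a straight line."""
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        delta = series[right] - series[left]
        for i in range(left + 1, right):
            filled[i] = series[left] + (i - left) * delta / span
    return filled


# Hypothetical band values over six dates; None marks cloud-covered dates.
observed = [10.0, None, None, 40.0, None, 20.0]
imputed = linear_interpolate(observed)
```

A straight line cannot reproduce a short-lived peak that falls entirely inside a gap, which is the kind of event (for example an algal bloom) that motivates the learned models compared in the study.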

ML · Mar 28
Bayes-MICE: A Bayesian Approach to Multiple Imputation for Time Series Data

Amuche Ibenegbu, Pierre Lafaye de Micheaux, Rohitash Chandra

Time-series analysis is often affected by missing data, a common problem across several fields, including healthcare and environmental monitoring. Multiple Imputation by Chained Equations (MICE) has been prominent for imputing missing values through "fully conditional specification". We extend MICE using the Bayesian framework (Bayes-MICE), utilising Bayesian inference to impute missing values via Markov Chain Monte Carlo (MCMC) sampling to account for uncertainty in MICE model parameters and imputed values. We also include temporally informed initialisation and time-lagged features in the model to respect the sequential nature of time-series data. We evaluate the Bayes-MICE method using two real-world datasets (AirQuality and PhysioNet), and using both the Random Walk Metropolis (RWM) and the Metropolis-Adjusted Langevin Algorithm (MALA) samplers. Our results demonstrate that Bayes-MICE reduces imputation errors relative to the baseline methods over all variables and accounts for uncertainty in the imputation process, thereby providing a more accurate measure of imputation error. We also found that MALA converges faster than RWM, achieving comparable accuracy while providing more consistent posterior exploration. Overall, these findings suggest that the Bayes-MICE framework represents a practical and efficient approach to time-series imputation, balancing increased accuracy with meaningful quantification of uncertainty in various environmental and clinical settings.

LG · Jul 20, 2024
Ensemble quantile-based deep learning framework for streamflow and flood prediction in Australian catchments

Rohitash Chandra, Arpit Kapoor, Siddharth Khedkar et al.

In recent years, climate extremes such as floods have created significant environmental and economic hazards for Australia. Deep learning methods have been promising for predicting extreme climate events; however, large flooding events present a critical challenge due to factors such as model calibration and missing data. We present an ensemble quantile-based deep learning framework that addresses large-scale streamflow forecasts using quantile regression for uncertainty projections in prediction. We evaluate selected univariate and multivariate deep learning models and catchment strategies. Furthermore, we implement a multistep time-series prediction model using the CAMELS dataset for selected catchments across Australia. The ensemble model employs a set of quantile deep learning models for streamflow determined by historical streamflow data. We utilise the streamflow prediction and obtain flood probability using flood frequency analysis and compare it with historical flooding events for selected catchments. Our results demonstrate notable efficacy and uncertainties in streamflow forecasts with varied catchment properties. Our flood probability estimates show good accuracy in capturing the historical floods from the selected catchments. This underscores the potential for our deep learning framework to revolutionise flood forecasting across diverse regions and be implemented as an early warning system.
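Quantile regression, which the framework above uses for uncertainty projections, is usually trained with the quantile (pinball) loss. The sketch below shows that loss in plain Python. It is a generic textbook form rather than the paper's code, and the streamflow numbers are made up for illustration.

```python
def pinball_loss(y_true, y_pred, quantile):
    """Mean quantile (pinball) loss for predictions of a given quantile."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        error = y - q
        # Under-prediction is weighted by quantile, over-prediction by 1 - quantile.
        total += max(quantile * error, (quantile - 1.0) * error)
    return total / len(y_true)


# Hypothetical observed streamflow and two candidate 0.9-quantile predictions.
observed = [10.0]
loss_under = pinball_loss(observed, [8.0], quantile=0.9)   # prediction too low
loss_over = pinball_loss(observed, [12.0], quantile=0.9)   # prediction too high
```

For a high quantile such as 0.9, predicting too low is penalised more heavily than predicting too high, which pushes the model towards an upper bound on streamflow. An ensemble of models trained at different quantiles then yields a prediction interval.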

CL · Sep 8, 2024
Evaluation of Google Translate for Mandarin Chinese translation using sentiment and semantic analysis

Xuechun Wang, Rodney Beard, Rohitash Chandra

Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government and media in China. In this study, we provide an automated assessment of translation quality of Google Translate with human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations may be due to lack of contextual significance and historical knowledge of China.

CL · Aug 29, 2024
A longitudinal sentiment analysis of Sinophobia during COVID-19 using large language models

Chen Wang, Rohitash Chandra

The COVID-19 pandemic has exacerbated xenophobia, particularly Sinophobia, leading to widespread discrimination against individuals of Chinese descent. Large language models (LLMs) are pre-trained deep learning models used for natural language processing (NLP) tasks. The ability of LLMs to understand and generate human-like text makes them particularly useful for analysing social media data to detect and evaluate sentiments. We present a sentiment analysis framework utilising LLMs for longitudinal sentiment analysis of the Sinophobic sentiments expressed in X (Twitter) during the COVID-19 pandemic. The results show a significant correlation between the spikes in Sinophobic tweets, Sinophobic sentiments and surges in COVID-19 cases, revealing that the evolution of the pandemic influenced public sentiment and the prevalence of Sinophobic discourse. Furthermore, the sentiment analysis revealed a predominant presence of negative sentiments, such as annoyance and denial, which underscores the impact of political narratives and misinformation shaping public opinion. The lack of empathetic sentiment which was present in previous studies related to COVID-19 highlights the way the political narratives in media viewed the pandemic and how it blamed the Chinese community. Our study highlights the importance of transparent communication in mitigating xenophobic sentiments during global crises.

CY · Jan 5
An evaluation of LLMs for political bias in Western media: Israel-Hamas and Ukraine-Russia wars

Rohitash Chandra, Haoyan Chen, Yaqing Zhang et al.

Political bias in media plays a critical role in shaping public opinion, voter behaviour, and broader democratic discourse. Subjective opinions and political bias can be found in media sources, such as newspapers, depending on their funding mechanisms and alliances with political parties. Automating the detection of political biases in media content can limit biases in elections. The impact of large language models (LLMs) in politics and media studies is becoming prominent. In this study, we utilise LLMs to compare the left-wing, right-wing, and neutral political opinions expressed in The Guardian and the BBC. We review newspaper reporting that includes significant events such as the Russia-Ukraine war and the Hamas-Israel conflict. We analyse the proportion for each opinion to find the bias under different LLMs, including BERT, Gemini, and DeepSeek. Our results show that after the outbreak of the wars, the political bias of Western media shifts towards the left-wing and each LLM gives a different result. DeepSeek consistently showed a stable Left-leaning tendency, while BERT and Gemini remained closer to the Centre. The BBC and The Guardian showed distinct reporting behaviours across the two conflicts. In the Russia-Ukraine war, both outlets maintained relatively stable positions; however, in the Israel-Hamas conflict, we identified larger political bias shifts, particularly in The Guardian's coverage, suggesting a more event-driven pattern of reporting bias. These variations suggest that LLMs are shaped not only by their training data and architecture, but also by underlying worldviews with associated political biases.

CL · Nov 14, 2025
Analysing Personal Attacks in U.S. Presidential Debates

Ruban Goyal, Rohitash Chandra, Sonit Singh

Personal attacks have become a notable feature of U.S. presidential debates and play an important role in shaping public perception during elections. Detecting such attacks can improve transparency in political discourse and provide insights for journalists, analysts and the public. Advances in deep learning and transformer-based models, particularly BERT and large language models (LLMs), have created new opportunities for automated detection of harmful language. Motivated by these developments, we present a framework for analysing personal attacks in U.S. presidential debates. Our work involves manual annotation of debate transcripts across the 2016, 2020 and 2024 election cycles, followed by statistical and language-model based analysis. We investigate the potential of fine-tuned transformer models alongside general-purpose LLMs to detect personal attacks in formal political speech. This study demonstrates how task-specific adaptation of modern language models can contribute to a deeper understanding of political communication.

LG · Apr 2, 2024
Remote sensing framework for geological mapping via stacked autoencoders and clustering

Sandeep Nagar, Ehsan Farahbakhsh, Joseph Awange et al.

Supervised machine learning methods for geological mapping via remote sensing face limitations due to the scarcity of accurately labelled training data that can be addressed by unsupervised learning, such as dimensionality reduction and clustering. Dimensionality reduction methods have the potential to play a crucial role in improving the accuracy of geological maps. Although conventional dimensionality reduction methods may struggle with nonlinear data, unsupervised deep learning models such as autoencoders can model non-linear relationships. Stacked autoencoders feature multiple interconnected layers to capture hierarchical data representations useful for remote sensing data. We present an unsupervised machine learning-based framework for processing remote sensing data using stacked autoencoders for dimensionality reduction and k-means clustering for mapping geological units. We use Landsat 8, ASTER, and Sentinel-2 datasets to evaluate the framework for geological mapping of the Mutawintji region in Western New South Wales, Australia. We also compare stacked autoencoders with principal component analysis (PCA) and canonical autoencoders. Our results reveal that the framework produces accurate and interpretable geological maps, efficiently discriminating rock units. The results reveal that the combination of stacked autoencoders with Sentinel-2 data yields the best performance accuracy when compared to other combinations. We find that stacked autoencoders enable better extraction of complex and hierarchical representations of the input data when compared to canonical autoencoders and PCA. We also find that the generated maps align with prior geological knowledge of the study area while providing novel insights into geological structures.

IVJan 1, 2024
Self-supervised learning for skin cancer diagnosis with limited training data

Hamish Haggerty, Rohitash Chandra

Early cancer detection is crucial for prognosis, but many cancer types lack large labelled datasets required for developing deep learning models. This paper investigates self-supervised learning (SSL) as an alternative to the standard supervised pre-training on ImageNet for scenarios with limited training data using a deep learning model (ResNet-50). We first demonstrate that SSL pre-training on ImageNet (via the Barlow Twins SSL algorithm) outperforms supervised pre-training (SL) using a skin lesion dataset with limited training samples. We then consider further SSL pre-training (of the two ImageNet pre-trained models) on task-specific datasets, where our implementation is motivated by supervised transfer learning. This approach significantly enhances initially SL pre-trained models, closing the performance gap with initially SSL pre-trained ones. Surprisingly, further pre-training on just the limited fine-tuning data achieves this performance equivalence. Linear probe experiments reveal that improvement stems from enhanced feature extraction. Hence, we find that minimal further SSL pre-training on task-specific data can be as effective as large-scale SSL pre-training on ImageNet for medical image classification tasks with limited labelled data. We validate these results on an oral cancer histopathology dataset, suggesting broader applicability across medical imaging domains facing labelled data scarcity.

CLMay 20, 2024
Large language models for newspaper sentiment analysis during COVID-19: The Guardian

Rohitash Chandra, Baicheng Zhu, Qingying Fang et al.

During the COVID-19 pandemic, news media coverage encompassed a wide range of topics, including viral transmission, allocation of medical resources, and government response measures. There have been studies on sentiment analysis of social media platforms during COVID-19 to understand the public response given the rise of cases and government strategies implemented to control the spread of the virus. Sentiment analysis can provide a better understanding of changes in societal opinions and emotional trends during the pandemic. Apart from social media, newspapers have played a vital role in the dissemination of information, including information from the government, experts, and also the public about various topics. A study of sentiment analysis of newspaper sources during COVID-19 for selected countries can give an overview of how the media covered the pandemic. In this study, we select The Guardian newspaper and provide a sentiment analysis during various stages of COVID-19 that includes initial transmission, lockdowns and vaccination. We employ novel large language models (LLMs) and refine them with expert-labelled sentiment analysis data. We also provide an analysis of sentiments experienced pre-pandemic for comparison. The results indicate that during the early pandemic stages, public sentiment prioritised urgent crisis response, later shifting focus to addressing the impact on health and the economy. In comparison with related studies about social media sentiment analyses, we found a discrepancy: The Guardian showed a dominance of negative sentiments (sad, annoyed, anxious and denial), suggesting that social media offers a more diversified emotional reflection. We found a grim narrative in The Guardian with overall dominance of negative sentiments, pre and during COVID-19 across news sections including Australia, UK, World News, and Opinion.

CVFeb 25, 2025
Convolutional neural networks for mineral prospecting through alteration mapping with remote sensing data

Ehsan Farahbakhsh, Dakshi Goel, Dhiraj Pimparkar et al.

Traditional geological mapping, based on field observations and rock sample analysis, is inefficient for continuous spatial mapping of features like alteration zones. Deep learning models, such as convolutional neural networks (CNNs), have revolutionised remote sensing data analysis by automatically extracting features for classification and regression tasks. CNNs can detect specific mineralogical changes linked to mineralisation by identifying subtle features in remote sensing data. This study uses CNNs with Landsat 8, Landsat 9, and ASTER data to map alteration zones north of Broken Hill, New South Wales, Australia. The model is trained using ground truth data and an automated approach with selective principal component analysis (PCA). We compare CNNs with traditional machine learning models, including k-nearest neighbours, support vector machines, and multilayer perceptron. Results show that ground truth-based training yields more reliable maps, with CNNs slightly outperforming conventional models in capturing spatial patterns. Landsat 9 outperforms Landsat 8 in mapping iron oxide areas using ground truth-trained CNNs, while ASTER data provides the most accurate argillic and propylitic alteration maps. This highlights CNNs' effectiveness in improving geological mapping precision, especially for identifying subtle mineralisation-related alterations.

LGNov 24, 2024
Quantile deep learning models for multi-step ahead time series prediction

Jimmy Cheung, Smruthi Rangarajan, Amelia Maddocks et al.

Uncertainty quantification is crucial in time series prediction, and quantile regression offers a valuable mechanism for uncertainty quantification which is useful for extreme value forecasting. Although deep learning models have been prominent in multi-step ahead prediction, the development and evaluation of quantile deep learning models have been limited. We present a novel quantile regression deep learning framework for multi-step time series prediction. In this way, we elevate the capabilities of deep learning models by incorporating quantile regression, thus providing a more nuanced understanding of predictive values. We provide an implementation of prominent deep learning models for multi-step ahead time series prediction and evaluate their performance under high volatility and extreme conditions. We include multivariate and univariate modelling strategies and provide a comparison with conventional deep learning models from the literature. Our models are tested on two cryptocurrencies, Bitcoin and Ethereum, using daily close-price data and selected benchmark time series datasets. The results show that integrating a quantile loss function with deep learning provides additional predictions for selected quantiles without a loss in the prediction accuracy when compared to the literature. Our quantile model has the ability to handle volatility more effectively and provides additional information for decision-making and uncertainty quantification through the use of quantiles when compared to conventional deep learning models.
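
The quantile loss referred to in this abstract is commonly implemented as the pinball loss. The following is a minimal sketch under that assumption, not the authors' implementation; the function name `pinball_loss` and the sample values are illustrative only.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        error = y - y_hat
        # Under-prediction (error > 0) costs q * error;
        # over-prediction (error < 0) costs (1 - q) * |error|.
        total += max(q * error, (q - 1.0) * error)
    return total / len(y_true)


# Illustrative values only (not taken from the paper).
observed = [10.0, 12.0, 11.0]
predicted = [9.0, 13.0, 11.0]
loss_median = pinball_loss(observed, predicted, 0.5)
loss_upper = pinball_loss(observed, predicted, 0.9)
```

Training one output per quantile level with this loss yields a set of predictions (for example the 5th, 50th and 95th percentiles) that can be read as an uncertainty band around the forecast.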

LGJun 10, 2025
Spatiotemporal deep learning models for detection of rapid intensification in cyclones

Vamshika Sutar, Amandeep Singh, Rohitash Chandra

Cyclone rapid intensification is the rapid increase in cyclone wind intensity, exceeding a threshold of 30 knots, within 24 hours. Rapid intensification is considered an extreme event during a cyclone, and its occurrence is relatively rare, contributing to a class imbalance in the dataset. A diverse array of factors influences the likelihood of a cyclone undergoing rapid intensification, further complicating the task for conventional machine learning models. In this paper, we evaluate deep learning, ensemble learning and data augmentation frameworks to detect cyclone rapid intensification based on wind intensity and spatial coordinates. We note that conventional data augmentation methods cannot be utilised for generating spatiotemporal patterns replicating cyclones that undergo rapid intensification. Therefore, our framework employs deep learning models to generate spatial coordinates and wind intensity that replicate cyclones to address the class imbalance problem of rapid intensification. We also use a deep learning model for the classification module within the data augmentation framework to differentiate between rapid and non-rapid intensification events during a cyclone. Our results show that data augmentation improves the results for rapid intensification detection in cyclones, and spatial coordinates play a critical role as input features to the given models. This paves the way for research in synthetic data generation for spatiotemporal data with extreme events.
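
The definition of rapid intensification given above can be turned into a labelling rule for a cyclone track. This is a hedged sketch: the 6-hourly sampling interval, the choice to compare each observation with the one exactly 24 hours later, the inclusive threshold, and the function name are all assumptions rather than details from the paper.

```python
def label_rapid_intensification(wind_knots, step_hours=6,
                                window_hours=24, threshold_knots=30.0):
    """Return one boolean per observation: True where wind intensity rises by
    at least `threshold_knots` over the following `window_hours`."""
    if window_hours % step_hours != 0:
        raise ValueError("window_hours must be a multiple of step_hours")
    steps = window_hours // step_hours
    labels = []
    for i in range(len(wind_knots)):
        j = i + steps
        if j < len(wind_knots):
            labels.append(wind_knots[j] - wind_knots[i] >= threshold_knots)
        else:
            # Not enough future observations to evaluate the window.
            labels.append(False)
    return labels


# Illustrative track (wind speed in knots at 6-hour intervals).
track = [35, 40, 50, 60, 70, 75, 75]
labels = label_rapid_intensification(track)
positive_rate = sum(labels) / len(labels)
```

Even in this small example only a minority of time steps are labelled positive, which illustrates the class imbalance the abstract describes.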

AIMay 30, 2025
Evaluation of LLMs for mathematical problem solving

Ruonan Wang, Runxi Wang, Yunwen Shen et al.

Large Language Models (LLMs) have shown impressive performance on a range of educational tasks, but are still understudied for their potential to solve mathematical problems. In this study, we compare three prominent LLMs, including GPT-4o, DeepSeek-V3, and Gemini-2.0, on three mathematics datasets of varying complexities (GSM8K, MATH500, and MIT Open Courseware datasets). We take a five-dimensional approach based on the Structured Chain-of-Thought (SCoT) framework to assess final answer correctness, step completeness, step validity, intermediate calculation accuracy, and problem comprehension. The results show that GPT-4o is the most stable and consistent in performance across all the datasets, but particularly it performs outstandingly in high-level questions of the MIT Open Courseware dataset. DeepSeek-V3 is competitively strong in well-structured domains such as optimisation, but suffers from fluctuations in accuracy in statistical inference tasks. Gemini-2.0 shows strong linguistic understanding and clarity in well-structured problems but performs poorly in multi-step reasoning and symbolic logic. Our error analysis reveals particular deficits in each model: GPT-4o is at times lacking in sufficient explanation or precision; DeepSeek-V3 leaves out intermediate steps; and Gemini-2.0 is less flexible in mathematical reasoning in higher dimensions.

LGJan 12, 2025
Compact Bayesian Neural Networks via pruned MCMC sampling

Ratneel Deo, Scott Sisson, Jody M. Webster et al.

Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. This is mainly due to the problem of sampling multimodal posterior distributions using Markov Chain Monte Carlo (MCMC) sampling and variational inference algorithms. Moreover, the number of model parameters scales exponentially with additional hidden layers, neurons, and features in the dataset. Typically, a significant portion of these densely connected parameters are redundant and pruning a neural network not only improves portability but also has the potential for better generalisation capabilities. In this study, we address some of the challenges by leveraging MCMC sampling with network pruning to obtain compact probabilistic models having removed redundant parameters. We sample the posterior distribution of model parameters (weights and biases) and prune weights with low importance, resulting in a compact model. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution while retaining the model training and generalisation performance accuracy by adapting post-pruning resampling. We evaluate the effectiveness of our MCMC pruning strategy on selected benchmark datasets for regression and classification problems through empirical result analysis. We also consider two coral reef drill-core lithology classification datasets to test the robustness of the pruning model in complex real-world datasets. We further investigate if refining the compact BNN can recover any loss of performance. Our results demonstrate the feasibility of training and pruning BNNs using MCMC whilst retaining generalisation performance with over 75% reduction in network size. This paves the way for developing compact BNN models that provide uncertainty estimates for real-world applications.
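
The abstract does not state how weight importance is measured, so the sketch below assumes a signal-to-noise score (absolute posterior mean divided by posterior standard deviation) purely for illustration. The names `prune_mask` and `apply_mask` are hypothetical, and the post-pruning resampling step is not shown.

```python
import statistics


def prune_mask(samples, prune_fraction):
    """Build a keep-mask from posterior draws.

    samples: list of posterior draws, each a list of weights.
    prune_fraction: share of weights to remove, in [0, 1).
    Importance is assumed to be |posterior mean| / posterior std.
    """
    if not 0.0 <= prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in [0, 1)")
    n_weights = len(samples[0])
    scores = []
    for k in range(n_weights):
        draws = [draw[k] for draw in samples]
        mean = statistics.fmean(draws)
        std = statistics.pstdev(draws)
        scores.append(abs(mean) / (std + 1e-12))
    n_prune = int(prune_fraction * n_weights)
    # Indices sorted from least to most important.
    order = sorted(range(n_weights), key=lambda k: scores[k])
    pruned = set(order[:n_prune])
    return [k not in pruned for k in range(n_weights)]


def apply_mask(samples, mask):
    """Zero out pruned weights in every posterior draw."""
    return [[w if keep else 0.0 for w, keep in zip(draw, mask)]
            for draw in samples]


# Illustrative posterior draws for four weights (made-up values).
draws = [
    [0.9, 0.01, -1.1, 0.02],
    [1.1, -0.01, -0.9, -0.02],
    [1.0, 0.0, -1.0, 0.0],
]
mask = prune_mask(draws, prune_fraction=0.5)
compact_draws = apply_mask(draws, mask)
```

Because the mask is applied to every draw, the remaining weights still form a sample from which predictive uncertainty can be estimated.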

CLMar 27, 2025
An evaluation of LLMs and Google Translate for translation of selected Indian languages via sentiment and semantic analyses

Rohitash Chandra, Aryan Chaudhari, Yeshwanth Rayavarapu

Large language models (LLMs) have been prominent for language translation, including low-resource languages. There has been limited study on the assessment of the quality of translations generated by LLMs, including Gemini, GPT, and Google Translate. This study addresses this limitation by using semantic and sentiment analysis of selected LLMs for Indian languages, including Sanskrit, Telugu and Hindi. We select prominent texts (Bhagavad Gita, Tamas and Maha Prasthanam) that have been well translated by experts and use LLMs to generate their translations into English, and provide a comparison with selected expert (human) translations. Our investigation revealed that while LLMs have made significant progress in translation accuracy, challenges remain in preserving sentiment and semantic integrity, especially in metaphorical and philosophical contexts for texts such as the Bhagavad Gita. The sentiment analysis revealed that GPT models are better at preserving the sentiment polarity for the given texts when compared to human (expert) translation. The results revealed that GPT models are generally better at maintaining the sentiment and semantics when compared to Google Translate. This study could help in the development of accurate and culturally sensitive translation systems for large language models.

IRFeb 26, 2025
Multiview graph dual-attention deep learning and contrastive learning for multi-criteria recommender systems

Saman Forouzandeh, Pavel N. Krivitsky, Rohitash Chandra

Recommender systems leveraging deep learning models have been crucial for assisting users in selecting items aligned with their preferences and interests. However, a significant challenge persists in single-criteria recommender systems, which often overlook the diverse attributes of items that have been addressed by Multi-Criteria Recommender Systems (MCRS). Existing MCRS methods use a shared embedding vector for multi-criteria item ratings but have struggled to capture the nuanced relationships between users and items based on specific criteria. In this study, we present a novel representation for MCRS based on a multi-edge bipartite graph, where each edge represents one criterion rating of items by users, and Multiview Dual Graph Attention Networks (MDGAT). Employing MDGAT is beneficial and important for adequately considering all relations between users and items, given the presence of both local (criterion-based) and global (multi-criteria) relations. Additionally, we define anchor points in each view based on similarity and employ local and global contrastive learning to distinguish between positive and negative samples across each view and the entire graph. We evaluate our method on two real-world datasets and assess its performance based on item rating predictions. The results demonstrate that our method achieves higher accuracy compared to the baseline method for predicting item ratings on the same datasets. MDGAT effectively captures the local and global impact of neighbours and the similarity between nodes.

CLJan 20, 2025
Longitudinal Abuse and Sentiment Analysis of Hollywood Movie Dialogues using Language Models

Rohitash Chandra, Guoxiang Ren, Group-H

Over the past decades, there has been an increase in the prevalence of abusive and violent content in Hollywood movies. In this study, we use language models to explore the longitudinal abuse and sentiment analysis of Hollywood Oscar and blockbuster movie dialogues from 1950 to 2024. We provide an analysis of subtitles for over a thousand movies, which are categorised into four genres. We employ fine-tuned language models to examine the trends and shifts in emotional and abusive content over the past seven decades. Findings reveal significant temporal changes in movie dialogues, which reflect broader social and cultural influences. Overall, the emotional tendencies in the films are diverse, and the detection of abusive content also exhibits significant fluctuations. The results show a gradual rise in abusive content in recent decades, reflecting social norms and regulatory policy changes. Genres such as thrillers still present a higher frequency of abusive content that emphasises the ongoing narrative role of violence and conflict. At the same time, underlying positive emotions such as humour and optimism remain prevalent in most of the movies. Furthermore, the gradual increase of abusive content in movie dialogues has been significant over the last two decades, where Oscar-nominated movies overtook the top ten blockbusters.

CLJan 7, 2025
HP-BERT: A framework for longitudinal study of Hinduphobia on social media via language models

Ashutosh Singh, Rohitash Chandra

During the COVID-19 pandemic, community tensions intensified, contributing to discriminatory sentiments against various religious groups, including Hindu communities. Recent advances in language models have shown promise for social media analysis with potential for longitudinal studies of social media platforms, such as X (Twitter). We present a computational framework for analyzing anti-Hindu sentiment (Hinduphobia) during the COVID-19 period, introducing an abuse detection and sentiment analysis approach for longitudinal analysis on X. We curate and release a "Hinduphobic COVID-19 XDataset" containing 8,000 annotated and manually verified tweets. We then develop the Hinduphobic BERT (HP-BERT) model using this dataset and achieve 94.72% accuracy, outperforming baseline Transformer-based language models. The model incorporates multi-label sentiment analysis capabilities through additional fine-tuning. Our analysis encompasses approximately 27.4 million tweets from six countries, including Australia, Brazil, India, Indonesia, Japan, and the United Kingdom. Statistical analysis reveals moderate correlations (r = 0.312-0.428) between COVID-19 case increases and Hinduphobic content volume, highlighting how pandemic-related stress may contribute to discriminatory discourse. This study provides evidence of social media-based religious discrimination during the COVID-19 crisis.
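
The correlation values reported above are written as r, which suggests a Pearson correlation coefficient; that reading is an assumption, as is everything in the sketch below. The series are made-up numbers and `pearson_r` is an illustrative helper, not code from the paper.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly values: case increases vs. flagged tweet counts.
case_increase = [1, 2, 3, 4]
flagged_tweets = [1, 3, 2, 4]
r = pearson_r(case_increase, flagged_tweets)
```

A value between roughly 0.3 and 0.5 is usually described as a moderate positive association, which matches the wording of the abstract; correlation alone does not establish that one series drives the other.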

CLJan 1, 2024
Large language model for Bible sentiment analysis: Sermon on the Mount

Mahek Vora, Tom Blau, Vansh Kachhwal et al.

The revolution of natural language processing via large language models has motivated its use in multidisciplinary areas that include social sciences and humanities and, more specifically, comparative religion. Sentiment analysis provides a mechanism to study the emotions expressed in text. Recently, sentiment analysis has been used to study and compare translations of the Bhagavad Gita, which is a fundamental and sacred Hindu text. In this study, we use sentiment analysis for studying selected chapters of the Bible. These chapters are known as the Sermon on the Mount. We utilize a pre-trained language model for sentiment analysis by reviewing five translations of the Sermon on the Mount, which include the King James Version, the New International Version, the New Revised Standard Version, the Lamsa Version, and the Basic English Version. We provide a chapter-by-chapter and verse-by-verse comparison using sentiment and semantic analysis and review the major sentiments expressed. Our results highlight the varying sentiments across the chapters and verses. We found that the vocabulary of the respective translations is significantly different. We detected different levels of humour, optimism, and empathy in the respective chapters that were used by Jesus to deliver his message.

SDJan 21
Abusive music and song transformation using GenAI and LLMs

Jiyang Choi, Rohitash Chandra

Repeated exposure to violence and abusive content in music and song content can influence listeners' emotions and behaviours, potentially normalising aggression or reinforcing harmful stereotypes. In this study, we explore the use of generative artificial intelligence (GenAI) and Large Language Models (LLMs) to automatically transform abusive words (vocal delivery) and lyrical content in popular music. Rather than simply muting or replacing a single word, our approach transforms the tone, intensity, and sentiment, thus altering not just the lyrics but also how they are expressed. We present a comparative analysis of four selected English songs and their transformed counterparts, evaluating changes through both acoustic and sentiment-based lenses. Our findings indicate that GenAI significantly reduces vocal aggressiveness, with acoustic analysis showing improvements in Harmonic to Noise Ratio, Cepstral Peak Prominence, and Shimmer. Sentiment analysis reduced aggression by 63.3-85.6% across artists, with major improvements in chorus sections (up to 88.6% reduction). The transformed versions maintained musical coherence while mitigating harmful content, offering a promising alternative to traditional content moderation that avoids triggering the "forbidden fruit" effect, where the censored content becomes more appealing simply because it is restricted. This approach demonstrates the potential for GenAI to create safer listening experiences while preserving artistic expression.

LGOct 28, 2025
DynBERG: Dynamic BERT-based Graph neural network for financial fraud detection

Omkar Kulkarni, Rohitash Chandra

Financial fraud detection is critical for maintaining the integrity of financial systems, particularly in decentralised environments such as cryptocurrency networks. Although Graph Convolutional Networks (GCNs) are widely used for financial fraud detection, graph Transformer models such as Graph-BERT are gaining prominence due to their Transformer-based architecture, which mitigates issues such as over-smoothing. Graph-BERT is designed for static graphs and primarily evaluated on citation networks with undirected edges. However, financial transaction networks are inherently dynamic, with evolving structures and directed edges representing the flow of money. To address these challenges, we introduce DynBERG, a novel architecture that integrates Graph-BERT with a Gated Recurrent Unit (GRU) layer to capture temporal evolution over multiple time steps. Additionally, we modify the underlying algorithm to support directed edges, making DynBERG well-suited for dynamic financial transaction analysis. We evaluate our model on the Elliptic dataset, which includes Bitcoin transactions, including all transactions during a major cryptocurrency market event, the Dark Market Shutdown. By assessing DynBERG's resilience before and after this event, we analyse its ability to adapt to significant market shifts that impact transaction behaviours. Our model is benchmarked against state-of-the-art dynamic graph classification approaches, such as EvolveGCN and GCN, demonstrating superior performance, outperforming EvolveGCN before the market shutdown and surpassing GCN after the event. Additionally, an ablation study highlights the critical role of incorporating a time-series deep learning component, showcasing the effectiveness of GRU in modelling the temporal dynamics of financial transactions.

LGOct 6, 2025
QDeepGR4J: Quantile-based ensemble of deep learning and GR4J hybrid rainfall-runoff models for extreme flow prediction with uncertainty quantification

Arpit Kapoor, Rohitash Chandra

Conceptual rainfall-runoff models aid hydrologists and climate scientists in modelling streamflow to inform water management practices. Recent advances in deep learning have unravelled the potential for combining hydrological models with deep learning models for better interpretability and improved predictive performance. In our previous work, we introduced DeepGR4J, which enhanced the GR4J conceptual rainfall-runoff model using a deep learning model to serve as a surrogate for the routing component. DeepGR4J had an improved rainfall-runoff prediction accuracy, particularly in arid catchments. Quantile regression models have been extensively used for quantifying uncertainty while aiding extreme value forecasting. In this paper, we extend DeepGR4J using a quantile regression-based ensemble learning framework to quantify uncertainty in streamflow prediction. We also leverage the uncertainty bounds to identify extreme flow events potentially leading to flooding. We further extend the model to multi-step streamflow predictions for uncertainty bounds. We design experiments for a detailed evaluation of the proposed framework using the CAMELS-Aus dataset. The results show that our proposed Quantile DeepGR4J framework improves the predictive accuracy and uncertainty interval quality (interval score) compared to baseline deep learning models. Furthermore, we carry out flood risk evaluation using Quantile DeepGR4J, and the results demonstrate its suitability as an early warning system.

CLOct 6, 2025
Language models for longitudinal analysis of abusive content in Billboard Music Charts

Rohitash Chandra, Yathin Suresh, Divyansh Raj Sinha et al.

There is no doubt that there has been a drastic increase in abusive and sexually explicit content in music, particularly in Billboard Music Charts. However, there is a lack of studies that validate the trend for effective policy development, as such content can lead to harmful behavioural changes in children and youths. In this study, we utilise deep learning methods to analyse songs (lyrics) from Billboard Charts of the United States in the last seven decades. We provide a longitudinal study using deep learning and language models and review the evolution of content using sentiment analysis and abuse detection, including sexually explicit content. Our results show a significant rise in explicit content in popular music from 1990 onwards. Furthermore, we find an increasing prevalence of songs with lyrics containing profane, sexually explicit, and otherwise inappropriate language. The longitudinal analysis demonstrates the ability of language models to capture nuanced patterns in lyrical content, reflecting shifts in societal norms and language use over time.

LGOct 2, 2025
Extreme value forecasting using relevance-based data augmentation with deep learning models

Junru Hua, Rahul Ahluwalia, Rohitash Chandra

Data augmentation with generative adversarial networks (GANs) has been popular for class imbalance problems, mainly for pattern classification and computer vision-related applications. Extreme value forecasting is a challenging field that has various applications from finance to climate change problems. In this study, we present a data augmentation framework for extreme value forecasting. In this framework, our focus is on forecasting extreme values using deep learning models in combination with data augmentation models such as GANs and synthetic minority oversampling technique (SMOTE). We use deep learning models such as convolutional long short-term memory (Conv-LSTM) and bidirectional long short-term memory (BD-LSTM) networks for multistep ahead prediction featuring extremes. We investigate which data augmentation models are the most suitable, taking into account the prediction accuracy overall and at extreme regions, along with computational efficiency. We also present novel strategies for incorporating data augmentation, considering extreme values based on a relevance function. Our results indicate that the SMOTE-based strategy consistently demonstrated superior adaptability, leading to improved performance across both short- and long-horizon forecasts. Conv-LSTM and BD-LSTM exhibit complementary strengths: the former excels in periodic, stable datasets, while the latter performs better in chaotic or non-stationary sequences.
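
The interpolation step at the heart of SMOTE can be sketched as follows. This is a simplified illustration that assumes the extreme-valued samples have already been selected (for example by a relevance function, which is not implemented here); the function name, parameters and data are hypothetical and this is not the authors' strategy.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a randomly
    chosen minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Illustrative "extreme" samples (made-up two-dimensional feature vectors).
extremes = [[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [2.0, 2.0]]
new_points = smote_like(extremes, n_new=5)
```

Each synthetic point lies on the line segment between two existing extreme samples, which is cheap to compute and may partly explain why SMOTE-style methods are reported as computationally lighter than GANs.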

CLOct 2, 2025
Machine Learning for Detection and Analysis of Novel LLM Jailbreaks

John Hawkins, Aditya Pramar, Rodney Beard et al.

Large Language Models (LLMs) suffer from a range of vulnerabilities that allow malicious users to solicit undesirable responses through manipulation of the input text. These so-called jailbreak prompts are designed to trick the LLM into circumventing the safety guardrails put in place to keep responses acceptable to the developer's policies. In this study, we analyse the ability of different machine learning models to distinguish jailbreak prompts from genuine uses, including looking at our ability to identify jailbreaks that use previously unseen strategies. Our results indicate that, using current datasets, the best performance is achieved by fine-tuning a Bidirectional Encoder Representations from Transformers (BERT) model end-to-end for identifying jailbreaks. We visualise the keywords that distinguish jailbreak from genuine prompts and conclude that explicit reflexivity in prompt structure could be a signal of jailbreak intention.

CVSep 16, 2025
Landcover classification and change detection using remote sensing and machine learning: a case study of Western Fiji

Yadvendra Gurjar, Ruoni Wan, Ehsan Farahbakhsh et al.

As a developing country, Fiji is facing rapid urbanisation, which is visible in the massive development projects that include housing, roads, and civil works. In this study, we present machine learning and remote sensing frameworks to compare land use and land cover change from 2013 to 2024 in Nadi, Fiji. The ultimate goal of this study is to provide technical support in land cover/land use modelling and change detection. We used Landsat-8 satellite imagery for the study region and created our training dataset with labels for supervised machine learning. We used Google Earth Engine and unsupervised machine learning via k-means clustering to generate the land cover map. We used convolutional neural networks to classify the selected regions' land cover types. We present a visualisation of change detection, highlighting urban area changes over time to monitor changes in the map.

CVAug 5, 2025
Deep learning framework for crater detection and identification on the Moon and Mars

Yihan Ma, Zeyang Yu, Rohitash Chandra

Impact craters are among the most prominent geomorphological features on planetary surfaces and are of substantial significance in planetary science research. Their spatial distribution and morphological characteristics provide critical information on planetary surface composition, geological history, and impact processes. In recent years, the rapid advancement of deep learning models has fostered significant interest in automated crater detection. In this paper, we apply advancements in deep learning models for impact crater detection and identification. We use novel models, including Convolutional Neural Networks (CNNs) and variants such as YOLO and ResNet. We present a framework that features a two-stage approach where the first stage features crater identification using a simple classic CNN, ResNet-50 and YOLO. In the second stage, our framework employs YOLO-based detection for crater localisation. Therefore, we detect and identify different types of craters and present a summary report with remote sensing data for a selected region. We consider selected regions of Mars and the Moon for crater detection and identification based on remote sensing data. Our results indicate that YOLO demonstrates the most balanced crater detection performance, while ResNet-50 excels in identifying large craters with high precision.

CLJul 14, 2025
Abusive text transformation using LLMs

Rohitash Chandra, Jiyong Choi

Although Large Language Models (LLMs) have demonstrated significant advancements in natural language processing tasks, their effectiveness in the classification and transformation of abusive text into non-abusive versions remains an area for exploration. In this study, we aim to use LLMs to transform abusive text (tweets and reviews) featuring hate speech and swear words into non-abusive text, while retaining the intent of the text. We evaluate the performance of four state-of-the-art LLMs (Gemini, GPT-4o, DeepSeek and Groq) on their ability to identify abusive text. We then use them to transform the text and obtain a version that is clean of abusive and inappropriate content but maintains a similar level of sentiment and semantics, i.e. the transformed text needs to maintain its message. Afterwards, we evaluate the raw and transformed datasets with sentiment analysis and semantic analysis. Our results show Groq provides vastly different results when compared with other LLMs. We have identified similarities between GPT-4o and DeepSeek-V3.

CLMay 30, 2025
An evaluation of LLMs for generating movie reviews: GPT-4o, Gemini-2.0 and DeepSeek-V3

Brendan Sands, Yining Wang, Chenhao Xu et al.

Large language models (LLMs) have been prominent in various tasks, including text generation and summarisation. The applicability of LLMs to the generation of product reviews is gaining momentum, paving the way for the generation of movie reviews. In this study, we propose a framework that generates movie reviews using three LLMs (GPT-4o, DeepSeek-V3, and Gemini-2.0), and evaluate their performance by comparing the generated outputs with IMDb user reviews. We use movie subtitles and screenplays as input to the LLMs and investigate how they affect the quality of reviews generated. We review the LLM-based movie reviews in terms of vocabulary, sentiment polarity, similarity, and thematic consistency in comparison to IMDb user reviews. The results demonstrate that LLMs are capable of generating syntactically fluent and structurally complete movie reviews. Nevertheless, there is still a noticeable gap in emotional richness and stylistic coherence between LLM-generated and IMDb reviews, suggesting that further refinement is needed to improve the overall quality of movie review generation. We also provide a survey-based analysis where participants were asked to distinguish between LLM and IMDb user reviews. The results show that LLM-generated reviews are difficult to distinguish from IMDb user reviews. We found that DeepSeek-V3 produced the most balanced reviews, closely matching IMDb reviews. GPT-4o overemphasised positive emotions, while Gemini-2.0 captured negative emotions better but showed excessive emotional intensity.

LGFeb 8, 2025
Global Ease of Living Index: a machine learning framework for longitudinal analysis of major economies

Tanay Panat, Rohitash Chandra

The drastic changes in the global economy, geopolitical conditions, and disruptions such as the COVID-19 pandemic have impacted the cost of living and quality of life. It is important to understand the long-term nature of the cost of living and quality of life in major economies. A transparent and comprehensive living index must include multiple dimensions of living conditions. In this study, we present an approach to quantifying the quality of life through the Global Ease of Living Index that combines various socio-economic and infrastructural factors into a single composite score. Our index utilises economic indicators that define living standards, which could help in targeted interventions to improve specific areas. We present a machine learning framework for addressing the problem of missing data for some of the economic indicators for specific countries. We then curate and update the data and use a dimensionality reduction approach (principal component analysis) to create the Ease of Living Index for major economies since 1970. Our work significantly adds to the literature by offering a practical tool for policymakers to identify areas needing improvement, such as healthcare systems, employment opportunities, and public safety. Our approach with open data and code can be easily reproduced and applied to various contexts. This transparency and accessibility make our work a valuable resource for ongoing research and policy development in quality-of-life assessment.
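
As a rough illustration of how principal component analysis can collapse several indicators into one composite score, the sketch below standardises each indicator, finds the first principal component by power iteration, and rescales the projected scores to the range 0 to 1. The abstract only states that PCA is used; taking the first component as the index, the min-max rescaling, the function names and the toy data are all assumptions made for this example and are not taken from the paper.

```python
import math


def standardise(rows):
    """Centre each indicator (column) and scale it to unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
        for j in range(d)
    ]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]


def first_component(z, iterations=200):
    """Approximate the leading eigenvector of the covariance matrix by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[a] * r[b] for r in z) / n for b in range(d)] for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The sign of an eigenvector is arbitrary; orient it so loadings sum to a positive value.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def composite_index(rows):
    """Project standardised indicators onto the first component and rescale to [0, 1]."""
    z = standardise(rows)
    v = first_component(z)
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) for s in scores]


if __name__ == "__main__":
    # Hypothetical indicator values for four countries (rows) and three indicators (columns).
    toy = [[1, 10, 100], [2, 20, 210], [3, 30, 290], [4, 41, 400]]
    print(composite_index(toy))
```

The sketch assumes that higher indicator values mean better living conditions and that the indicators are not all constant; indicators where lower is better would need to be negated before use.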

LGJan 20, 2025
A Machine Learning Framework for Handling Unreliable Absence Label and Class Imbalance for Marine Stinger Beaching Prediction

Amuche Ibenegbu, Amandine Schaeffer, Pierre Lafaye de Micheaux et al.

Bluebottles (Physalia spp.) are marine stingers resembling jellyfish, whose presence on Australian beaches poses a significant public risk due to their venomous nature. Understanding the environmental factors driving bluebottles ashore is crucial for mitigating their impact, and machine learning tools are to date relatively unexplored. We use bluebottle marine stinger presence/absence data from beaches in Eastern Sydney, Australia, and compare machine learning models (Multilayer Perceptron, Random Forest, and XGBoost) to identify factors influencing their presence. We address challenges such as class imbalance, class overlap, and unreliable absence data by employing data augmentation techniques, including the Synthetic Minority Oversampling Technique (SMOTE), Random Undersampling, and a Synthetic Negative Approach that excludes the negative class. Our results show that SMOTE failed to resolve class overlap, but the presence-focused approach effectively handled imbalance, class overlap, and ambiguous absence data. Data attributes such as the wind direction, which is a circular variable, emerged as key factors influencing bluebottle presence, confirming previous inference studies. However, in the absence of population dynamics, biological behaviours, and life cycles, the best predictive model appears to be Random Forest combined with the Synthetic Negative Approach. This research contributes to mitigating the risks posed by bluebottles to beachgoers and provides insights into handling class overlap and an unreliable negative class in environmental modelling.
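
Because wind direction is a circular variable, 350 degrees and 10 degrees are close even though the raw numbers are far apart. A common way to present such a variable to a model is a sine and cosine pair. The abstract does not say how the authors encoded wind direction, so the sketch below is a generic illustration and its function names are hypothetical.

```python
import math


def encode_direction(degrees):
    """Map a compass direction in degrees to a (sin, cos) pair on the unit circle."""
    radians = math.radians(degrees % 360.0)
    return math.sin(radians), math.cos(radians)


def encoded_distance(a_degrees, b_degrees):
    """Euclidean distance between two encoded directions (the chord length)."""
    sin_a, cos_a = encode_direction(a_degrees)
    sin_b, cos_b = encode_direction(b_degrees)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


if __name__ == "__main__":
    # Directions on either side of north end up close together after encoding.
    print(encoded_distance(350.0, 10.0))
    # Opposite directions end up as far apart as the encoding allows.
    print(encoded_distance(350.0, 170.0))
```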

DCJan 18, 2022
Surrogate-assisted distributed swarm optimisation for computationally expensive geoscientific models

Rohitash Chandra, Yash Vardhan Sharma

Evolutionary algorithms provide gradient-free optimisation which is beneficial for models that have difficulty in obtaining gradients; for instance, geoscientific landscape evolution models. However, such models are at times computationally expensive and even distributed swarm-based optimisation with parallel computing struggles. We can incorporate efficient strategies such as surrogate-assisted optimisation to address the challenges; however, implementing inter-process communication for surrogate-based model training is difficult. In this paper, we implement surrogate-based estimation of fitness evaluation in distributed swarm optimisation over a parallel computing architecture. We first test the framework on a set of benchmark optimisation problems and then apply it to a geoscientific model that features a landscape evolution model. Our experiments show very promising results for benchmark functions and the Badlands landscape evolution model. We obtain a reduction in computational time while retaining optimisation solution accuracy through the use of surrogates in a parallel computing environment. The major contribution of the paper is in the application of surrogate-based optimisation for geoscientific models which can in the future help in a better understanding of paleoclimate and geomorphology.

CLJan 9, 2022
Semantic and sentiment analysis of selected Bhagavad Gita translations using BERT-based language framework

Rohitash Chandra, Venkatesh Kulkarni

It is well known that translations of songs and poems not only break rhythm and rhyming patterns, but can also result in loss of semantic information. The Bhagavad Gita is an ancient Hindu philosophical text originally written in Sanskrit that features a conversation between Lord Krishna and Arjuna prior to the Mahabharata war. The Bhagavad Gita is also one of the key sacred texts in Hinduism and is known as the forefront of the Vedic corpus of Hinduism. In the last two centuries, there has been a lot of interest in Hindu philosophy from western scholars; hence, the Bhagavad Gita has been translated into a number of languages. However, there is not much work that validates the quality of the English translations. Recent progress of language models powered by deep learning has enabled not only translations but also a better understanding of language and texts with semantic and sentiment analysis, which motivates our work. In this paper, we present a framework that compares selected translations (from Sanskrit to English) of the Bhagavad Gita using semantic and sentiment analyses. We use a hand-labelled sentiment dataset for tuning a state-of-the-art deep learning-based language model known as bidirectional encoder representations from transformers (BERT). We provide sentiment and semantic analysis for selected chapters and verses across translations. Our results show that although the style and vocabulary in the respective translations vary widely, the sentiment analysis and semantic similarity show that the messages conveyed are mostly similar.

LGAug 6, 2021
SMOTified-GAN for class imbalanced pattern classification problems

Anuraganand Sharma, Prabhat Kumar Singh, Rohitash Chandra

Class imbalance in a dataset is a major problem for classifiers that results in poor prediction with a high true positive rate (TPR) but a low true negative rate (TNR) for a majority positive training dataset. Generally, the pre-processing technique of oversampling of minority class(es) is used to overcome this deficiency. Our focus is on using the hybridization of Generative Adversarial Network (GAN) and Synthetic Minority Over-Sampling Technique (SMOTE) to address class imbalanced problems. We propose a novel two-phase oversampling approach involving knowledge transfer that has the synergy of SMOTE and GAN. The unrealistic or overgeneralized samples of SMOTE are transformed into a realistic distribution of data by GAN where there is not enough minority class data available for GAN to process them by itself effectively. We named it SMOTified-GAN as GAN works on pre-sampled minority data produced by SMOTE rather than randomly generating the samples itself. The experimental results prove the sample quality of minority class(es) has been improved in a variety of tested benchmark datasets. Its performance is improved by up to 9% from the next best algorithm tested on F1-score measurements. Its time complexity is also reasonable, at around $O(N^2d^2T)$ for a sequential algorithm.
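
To make the first phase concrete, the sketch below shows the core SMOTE step: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It covers only this interpolation step and not the GAN phase of SMOTified-GAN. The function name, parameters and toy data are hypothetical, and the code is a simplified illustration rather than the authors' implementation.

```python
import math
import random


def smote_samples(minority, k=3, n_new=10, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    Assumes `minority` holds at least two points given as equal-length tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority points by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


if __name__ == "__main__":
    # Hypothetical two-dimensional minority class.
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
    for point in smote_samples(minority, k=2, n_new=5):
        print(point)
```

Because each synthetic point is a convex combination of two existing minority points, it always lies inside the bounding box of the minority class, which is one way to see why plain SMOTE can produce overgeneralised samples when classes overlap.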

LGApr 17, 2021
Bayesian graph convolutional neural networks via tempered MCMC

Rohitash Chandra, Ayush Bhagat, Manavendra Maharana et al.

Deep learning models, such as convolutional neural networks, have long been applied to image and multi-media tasks, particularly those with structured data. More recently, there has been more attention to unstructured data that can be represented via graphs. These types of data are often found in health and medicine, social networks, and research data repositories. Graph convolutional neural networks have recently gained attention in the field of deep learning as they take advantage of graph-based data representation with automatic feature extraction via convolutions. Given the popularity of these methods in a wide range of applications, robust uncertainty quantification is vital. This remains a challenge for large models and unstructured datasets. Bayesian inference provides a principled approach to uncertainty quantification of model parameters for deep learning models. Although Bayesian inference has been used extensively elsewhere, its application to deep learning remains limited due to the computational requirements of the Markov Chain Monte Carlo (MCMC) methods. Recent advances in parallel computing and advanced proposal schemes in MCMC sampling methods have opened the path for Bayesian deep learning. In this paper, we present Bayesian graph convolutional neural networks that employ tempered MCMC sampling with Langevin-gradient proposal distribution implemented via parallel computing. Our results show that the proposed method can provide accuracy similar to advanced optimisers while providing uncertainty quantification for key benchmark problems.
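
The idea of a gradient-informed MCMC proposal can be illustrated on a toy problem. The sketch below runs a Metropolis-Hastings sampler whose proposal takes a small step along the gradient of the log-density and adds Gaussian noise, in the style of the Metropolis-adjusted Langevin algorithm. It uses a one-dimensional Gaussian target rather than a neural network, and it leaves out the tempering and parallel computing described in the abstract. The target, step size and function names are assumptions for this example, and the exact proposal used in the paper may differ.

```python
import math
import random


def log_target(x):
    """Unnormalised log-density of a Gaussian with mean 2 and unit variance (toy target)."""
    return -0.5 * (x - 2.0) ** 2


def grad_log_target(x):
    """Gradient of the toy log-density."""
    return -(x - 2.0)


def log_proposal(to, frm, step):
    """Log-density (up to a constant) of proposing `to` from `frm`."""
    mean = frm + 0.5 * step ** 2 * grad_log_target(frm)
    return -((to - mean) ** 2) / (2.0 * step ** 2)


def langevin_mh(n_samples, step=0.5, start=0.0, seed=0):
    """Metropolis-Hastings sampling with a Langevin-style gradient proposal."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        # Drift towards higher density, then add Gaussian noise.
        mean = x + 0.5 * step ** 2 * grad_log_target(x)
        candidate = rng.gauss(mean, step)
        # The proposal is asymmetric, so the acceptance ratio includes both directions.
        log_alpha = (
            log_target(candidate) + log_proposal(x, candidate, step)
            - log_target(x) - log_proposal(candidate, x, step)
        )
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = candidate
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = langevin_mh(20000)[2000:]  # discard an initial burn-in period
    mean = sum(draws) / len(draws)
    spread = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(mean, spread)
```

The spread of the retained draws is what provides uncertainty quantification: instead of a single optimised value, the sampler returns a set of values whose distribution approximates the target.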

LGApr 13, 2021
Revisiting Bayesian Autoencoders with MCMC

Rohitash Chandra, Mahir Jain, Manavendra Maharana et al.

Autoencoders gained popularity in the deep learning revolution given their ability to compress data and provide dimensionality reduction. Although prominent deep learning methods have been used to enhance autoencoders, the need to provide robust uncertainty quantification remains a challenge. This has been addressed with variational autoencoders so far. Bayesian inference via Markov Chain Monte Carlo (MCMC) sampling has faced several limitations for large models; however, recent advances in parallel computing and advanced proposal schemes have opened routes less traveled. This paper presents Bayesian autoencoders powered by MCMC sampling implemented using parallel computing and Langevin-gradient proposal distribution. The results indicate that the proposed Bayesian autoencoder provides similar performance accuracy when compared to related methods in the literature. Furthermore, it provides uncertainty quantification in the reduced data representation. This motivates further applications of the Bayesian autoencoder framework for other deep learning models.