M Saifur Rahman

LG
h-index15
5papers
14citations
Novelty49%
AI Score36

5 Papers

IVAug 3, 2023
Unmasking Parkinson's Disease with Smile: An AI-enabled Screening Framework

Tariq Adnan, Md Saiful Islam, Wasifur Rahman et al.

We present an efficient and accessible PD screening method by leveraging AI-driven models enabled by the largest video dataset of facial expressions from 1,059 unique participants. This dataset includes 256 individuals with PD, 165 clinically diagnosed, and 91 self-reported. Participants used webcams to record themselves mimicking three facial expressions (smile, disgust, and surprise) from diverse sources encompassing their homes across multiple countries, a US clinic, and a PD wellness center in the US. Facial landmarks are automatically tracked from the recordings to extract features related to hypomimia, a prominent PD symptom characterized by reduced facial expressions. Machine learning algorithms are trained on these features to distinguish between individuals with and without PD. The model was tested for generalizability on external (unseen during training) test videos collected from a US clinic and Bangladesh. An ensemble of machine learning models trained on smile videos achieved an accuracy of 87.9+-0.1% (95% Confidence Interval) with an AUROC of 89.3+-0.3% as evaluated on held-out data (using k-fold cross-validation). In external test settings, the ensemble model achieved 79.8+-0.6% accuracy with 81.9+-0.3% AUROC on the clinical test set and 84.9+-0.4% accuracy with 81.2+-0.6% AUROC on participants from Bangladesh. In every setting, the model was free from detectable bias across sex and ethnic subgroups, except in the cohorts from Bangladesh, where the model performed significantly better for female participants than males. Smiling videos can effectively differentiate between individuals with and without PD, offering a potentially easy, accessible, and cost-efficient way to screen for PD, especially when a clinical diagnosis is difficult to access.

LGNov 3, 2024
Privacy-Preserving Customer Churn Prediction Model in the Context of Telecommunication Industry

Joydeb Kumar Sana, M Sohel Rahman, M Saifur Rahman

Data is the main fuel of a successful machine learning model. A dataset may contain sensitive individual records e.g. personal health records, financial data, industrial information, etc. Training a model using this sensitive data has become a new privacy concern when someone uses third-party cloud computing. Trained models also suffer privacy attacks which leads to the leaking of sensitive information of the training data. This study is conducted to preserve the privacy of training data in the context of customer churn prediction modeling for the telecommunications industry (TCI). In this work, we propose a framework for privacy-preserving customer churn prediction (PPCCP) model in the cloud environment. We have proposed a novel approach which is a combination of Generative Adversarial Networks (GANs) and adaptive Weight-of-Evidence (aWOE). Synthetic data is generated from GANs, and aWOE is applied on the synthetic training dataset before feeding the data to the classification algorithms. Our experiments were carried out using eight different machine learning (ML) classifiers on three openly accessible datasets from the telecommunication sector. We then evaluated the performance using six commonly employed evaluation metrics. In addition to presenting a data privacy analysis, we also performed a statistical significance test. The training and prediction processes achieve data privacy and the prediction classifiers achieve high prediction performance (87.1\% in terms of F-Measure for GANs-aWOE based Naïve Bayes model). In contrast to earlier studies, our suggested approach demonstrates a prediction enhancement of up to 28.9\% and 27.9\% in terms of accuracy and F-measure, respectively.

AISep 16, 2025
The Art of Saying "Maybe": A Conformal Lens for Uncertainty Benchmarking in VLMs

Asif Azad, Mohammad Sadat Hossain, MD Sadik Hossain Shanto et al.

Vision-Language Models (VLMs) have achieved remarkable progress in complex visual understanding across scientific and reasoning tasks. While performance benchmarking has advanced our understanding of these capabilities, the critical dimension of uncertainty quantification has received insufficient attention. Therefore, unlike prior conformal prediction studies that focused on limited settings, we conduct a comprehensive uncertainty benchmarking study, evaluating 16 state-of-the-art VLMs (open and closed-source) across 6 multimodal datasets with 3 distinct scoring functions. Our findings demonstrate that larger models consistently exhibit better uncertainty quantification; models that know more also know better what they don't know. More certain models achieve higher accuracy, while mathematical and reasoning tasks elicit poorer uncertainty performance across all models compared to other domains. This work establishes a foundation for reliable uncertainty evaluation in multimodal systems.

LGJun 8, 2025
Patient Similarity Computation for Clinical Decision Support: An Efficient Use of Data Transformation, Combining Static and Time Series Data

Joydeb Kumar Sana, Mohammad M. Masud, M Sohel Rahman et al.

Patient similarity computation (PSC) is a fundamental problem in healthcare informatics. The aim of the patient similarity computation is to measure the similarity among patients according to their historical clinical records, which helps to improve clinical decision support. This paper presents a novel distributed patient similarity computation (DPSC) technique based on data transformation (DT) methods, utilizing an effective combination of time series and static data. Time series data are sensor-collected patients' information, including metrics like heart rate, blood pressure, Oxygen saturation, respiration, etc. The static data are mainly patient background and demographic data, including age, weight, height, gender, etc. Static data has been used for clustering the patients. Before feeding the static data to the machine learning model adaptive Weight-of-Evidence (aWOE) and Z-score data transformation (DT) methods have been performed, which improve the prediction performances. In aWOE-based patient similarity models, sensitive patient information has been processed using aWOE which preserves the data privacy of the trained models. We used the Dynamic Time Warping (DTW) approach, which is robust and very popular, for time series similarity. However, DTW is not suitable for big data due to the significant computational run-time. To overcome this problem, distributed DTW computation is used in this study. For Coronary Artery Disease, our DT based approach boosts prediction performance by as much as 11.4%, 10.20%, and 12.6% in terms of AUC, accuracy, and F-measure, respectively. In the case of Congestive Heart Failure (CHF), our proposed method achieves performance enhancement up to 15.9%, 10.5%, and 21.9% for the same measures, respectively. The proposed method reduces the computation time by as high as 40%.

CVDec 10, 2023
PULSAR: Graph based Positive Unlabeled Learning with Multi Stream Adaptive Convolutions for Parkinson's Disease Recognition

Md. Zarif Ul Alam, Md Saiful Islam, Ehsan Hoque et al.

Parkinson's disease (PD) is a neuro-degenerative disorder that affects movement, speech, and coordination. Timely diagnosis and treatment can improve the quality of life for PD patients. However, access to clinical diagnosis is limited in low and middle income countries (LMICs). Therefore, development of automated screening tools for PD can have a huge social impact, particularly in the public health sector. In this paper, we present PULSAR, a novel method to screen for PD from webcam-recorded videos of the finger-tapping task from the Movement Disorder Society - Unified Parkinson's Disease Rating Scale (MDS-UPDRS). PULSAR is trained and evaluated on data collected from 382 participants (183 self-reported as PD patients). We used an adaptive graph convolutional neural network to dynamically learn the spatio temporal graph edges specific to the finger-tapping task. We enhanced this idea with a multi stream adaptive convolution model to learn features from different modalities of data critical to detect PD, such as relative location of the finger joints, velocity and acceleration of tapping. As the labels of the videos are self-reported, there could be cases of undiagnosed PD in the non-PD labeled samples. We leveraged the idea of Positive Unlabeled (PU) Learning that does not need labeled negative data. Our experiments show clear benefit of modeling the problem in this way. PULSAR achieved 80.95% accuracy in validation set and a mean accuracy of 71.29% (2.49% standard deviation) in independent test, despite being trained with limited amount of data. This is specially promising as labeled data is scarce in health care sector. We hope PULSAR will make PD screening more accessible to everyone. The proposed techniques could be extended for assessment of other movement disorders, such as ataxia, and Huntington's disease.