CV | Jan 19
Early Prediction of Type 2 Diabetes Using Multimodal Data and Tabular Transformers
Sulaiman Khan, Md. Rafiul Biswas, Zubair Shah
This study introduces a novel approach for early Type 2 Diabetes Mellitus (T2DM) risk prediction using a tabular transformer (TabTrans) architecture to analyze longitudinal patient data. By processing patients' longitudinal health records and bone-related tabular data, our model captures complex, long-range dependencies in disease progression that conventional methods often overlook. We validated our TabTrans model on a retrospective Qatar BioBank (QBB) cohort of 1,382 subjects, comprising 725 men (146 diabetic, 579 healthy) and 657 women (133 diabetic, 524 healthy). The study integrated electronic health records (EHR) with dual-energy X-ray absorptiometry (DXA) data. To address class imbalance, we employed SMOTE and SMOTE-ENN resampling techniques. The proposed model's performance was evaluated against conventional machine learning (ML) and generative AI models, including Claude 3.5 Sonnet (Anthropic's constitutional AI), GPT-4 (OpenAI's generative pre-trained transformer), and Gemini Pro (Google's multimodal language model). Our TabTrans model demonstrated superior predictive performance, achieving a ROC AUC $\geq$ 79.7% for T2DM prediction and outperforming both the generative AI models and the conventional ML approaches. Feature interpretation analysis identified key risk indicators, with visceral adipose tissue (VAT) mass and volume, ward bone mineral density (BMD) and bone mineral content (BMC), T- and Z-scores, and L1-L4 scores emerging as the most important predictors associated with diabetes development in Qatari adults. These findings demonstrate the significant potential of TabTrans for analyzing complex tabular healthcare data, providing a powerful tool for proactive T2DM management and personalized clinical interventions in the Qatari population.
Index Terms: tabular transformers, multimodal data, DXA data, diabetes, T2DM, feature interpretation, tabular data
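The abstract names SMOTE for the class imbalance (279 diabetic versus 1,103 healthy subjects by the counts above) but gives no implementation details. The following is a minimal sketch of the core SMOTE idea only: a synthetic minority sample is drawn on the line segment between a minority sample and one of its nearest minority neighbours. The function name `smote_sketch`, the parameters `k` and `n_new`, the Euclidean distance, and the toy data are all assumptions for illustration, not the authors' pipeline.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy two-feature minority class (made-up values, not QBB data).
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_sketch(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point is a convex combination of two real minority samples, it always stays inside the bounding box of the minority class. SMOTE-ENN would additionally remove samples whose neighbours disagree with their label, a cleaning step this sketch omits.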
QM | May 28, 2025
Improving statistical learning methods via features selection without replacement sampling and random projection
Sulaiman Khan, Muhammad Ahmad, Fida Ullah et al.
Cancer is fundamentally a genetic disease characterized by genetic and epigenetic alterations that disrupt normal gene expression, leading to uncontrolled cell growth and metastasis. High-dimensional microarray datasets pose challenges for classification models due to the "small n, large p" problem, resulting in overfitting. This study makes three key contributions: 1) we propose a machine learning-based approach integrating the Feature Selection Without Replacement (FSWOR) technique and a projection method to improve classification accuracy; 2) we apply the Kendall statistical test to identify the most significant genes from the brain cancer microarray dataset (GSE50161), reducing the feature space from 54,675 to 20,890 genes; 3) we evaluate machine learning models using k-fold cross-validation, in which our model incorporates ensemble classifiers with LDA projection and Naïve Bayes, achieving a test score of 96% and outperforming existing methods by 9.09%. The results demonstrate the effectiveness of our approach in high-dimensional gene expression analysis, improving classification accuracy while mitigating overfitting. This study contributes to cancer biomarker discovery, offering a robust computational method for analyzing microarray data.
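The abstract does not say how the Kendall test is turned into a selection rule. One plausible reading, sketched below, is a filter that scores each gene by Kendall's tau-b against the class label and keeps genes whose absolute score passes a cut-off. The names `kendall_tau_b` and `select_features`, the threshold value, and the toy expression values are assumptions for illustration. The paper may instead use p-values or a different cut-off.

```python
import math


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length sequences (O(n^2) version)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both variables: excluded from both terms
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0


def select_features(columns, labels, threshold):
    """Keep features whose |tau-b| with the label reaches the threshold."""
    return [
        name
        for name, values in columns.items()
        if abs(kendall_tau_b(values, labels)) >= threshold
    ]


# Toy data: six samples, two hypothetical genes (values are made up).
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "gene_a": [1.0, 1.2, 0.9, 3.1, 2.8, 3.5],  # separates the classes
    "gene_b": [2.0, 0.5, 3.0, 2.1, 0.4, 3.2],  # mostly noise
}
selected = select_features(columns, labels, threshold=0.5)
print(selected)  # ['gene_a']
```

The tau-b variant is used here because class labels are heavily tied. On real data with tens of thousands of genes, a library implementation with an O(n log n) algorithm would be the practical choice.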
IV | Jun 2, 2024
An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging
Sulaiman Khan, Md. Rafiul Biswas, Alina Murad et al.
Recent developments in multimodal large language models (MLLMs) have spurred significant interest in their potential applications across various medical imaging domains. On the one hand, there is a temptation to use these generative models to synthesize realistic-looking medical image data, while on the other hand, the ability to identify synthetic image data in a pool of data is also important. In this study, we explore the potential of the Gemini (gemini-1.0-pro-vision-latest) and GPT-4V (gpt-4-vision-preview) models for medical image analysis using two modalities of medical image data. Utilizing synthetic and real imaging data, both Gemini and GPT-4V are first used to classify real versus synthetic images, followed by an interpretation and analysis of the input images. Experimental results demonstrate that both Gemini and GPT-4V could perform some interpretation of the input images. In this specific experiment, Gemini performed slightly better than GPT-4V on the classification task. In contrast, responses associated with GPT-4V were mostly generic in nature. Our early investigation provides insights into the potential of MLLMs to assist with the classification and interpretation of retinal fundoscopy and lung X-ray images. We also identify key limitations of this early investigation of MLLMs for specialized tasks in medical image analysis.
NI | Oct 7, 2021
Highly Accurate and Reliable Wireless Network Slicing in 5th Generation Networks: A Hybrid Deep Learning Approach
Sulaiman Khan, Suleman Khan, Yasir Ali et al.
Next-generation networks such as 5th generation (5G) and 6th generation (6G) networks require high security, low latency, high reliability, and high capacity. Reconfigurable wireless network slicing is considered one of the key elements of these networks. Reconfigurable slicing allows operators to run multiple instances of the network on a single infrastructure for better quality of service (QoS). The required QoS can be achieved by reconfiguring and optimizing these networks using artificial intelligence and machine learning algorithms. To develop a smart decision-making mechanism for network management and to limit network slice failures, machine learning-enabled reconfigurable wireless network solutions are required. In this paper, we propose a hybrid deep learning model that consists of a convolutional neural network (CNN) and a long short-term memory (LSTM) network. The CNN performs resource allocation, network reconfiguration, and slice selection, while the LSTM handles statistical information (load balancing, error rate, etc.) about the network slices. The applicability of the proposed model is validated under multiple unknown devices, slice failure, and overloading conditions. The proposed model achieves an overall accuracy of 95.17%, which reflects its applicability.
CV | May 27, 2019
An Intelligent Monitoring System of Vehicles on Highway Traffic
Sulaiman Khan, Hazrat Ali, Zia Ullah et al.
Vehicle speed monitoring and highway management are critical problems in this modern age of growing technology and population. Poor management results in frequent traffic jams, traffic rule violations, and fatal road accidents. Using traditional techniques such as RADAR, LIDAR, and LASER to address this problem is time-consuming, expensive, and tedious. This paper presents an efficient framework for a simple, cost-efficient, and intelligent vehicle speed monitoring system. The proposed method uses a high-definition (HD) camera mounted at the roadside, either on a pole or on a traffic signal, to record video frames. From these frames, a vehicle is tracked using a radius-growing method, and its speed is calculated from the vehicle mask and its displacement across consecutive frames. The method uses pattern recognition, digital image processing, and mathematical techniques for vehicle detection, tracking, and speed calculation. The proposed model is validated by testing it on different highways.
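The speed calculation from a vehicle mask and its displacement across consecutive frames can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it uses the centroid of a binary mask as the vehicle position, a single fixed calibration factor `metres_per_pixel`, and a known frame rate `fps`, none of which are specified in the abstract. The radius-growing tracking step is not reproduced here.

```python
import math


def mask_centroid(mask):
    """Centroid (row, col) of the nonzero pixels in a binary mask."""
    points = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not points:
        raise ValueError("mask contains no vehicle pixels")
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)


def speed_kmh(mask_prev, mask_next, metres_per_pixel, fps):
    """Speed from the centroid displacement between two consecutive frames."""
    r0, c0 = mask_centroid(mask_prev)
    r1, c1 = mask_centroid(mask_next)
    pixels = math.hypot(r1 - r0, c1 - c0)
    # One frame interval lasts 1 / fps seconds.
    metres_per_second = pixels * metres_per_pixel * fps
    return metres_per_second * 3.6


# Toy masks: a 2x2 vehicle blob that moves 3 pixels to the right.
frame_a = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
]
speed = speed_kmh(frame_a, frame_b, metres_per_pixel=0.25, fps=30)
print(round(speed, 1))  # 81.0
```

A constant metres-per-pixel factor only holds when the camera views the road roughly perpendicular to the direction of travel. A roadside camera with perspective distortion would need a proper ground-plane calibration.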
CV | Apr 6, 2019
KNN and ANN-based Recognition of Handwritten Pashto Letters using Zoning Features
Sulaiman Khan, Hazrat Ali, Zahid Ullah et al.
This paper presents a recognition system for handwritten Pashto letters. Handwritten character recognition is a challenging task: the letters not only differ in shape and style but also vary among individuals. Recognition is made harder by the lack of standard datasets for handwritten Pashto letters. In this work, we have built a database of moderate size comprising 4,488 images, with 102 distinct samples for each of the 44 letters in Pashto. The recognition framework uses a zoning feature extractor followed by K-Nearest Neighbour (KNN) and Neural Network (NN) classifiers to classify individual letters. In evaluation, the proposed system achieves an overall classification accuracy of approximately 70.05% with KNN and 72% with NN.
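As a rough illustration of zoning features followed by KNN, the sketch below splits a binary image into a grid of zones, uses the fraction of foreground pixels in each zone as the feature vector, and classifies a query by majority vote among its nearest training samples. The zone grid size, the density feature, the value of `k`, and the toy 4x4 shapes are assumptions, since the abstract does not state them.

```python
from collections import Counter
import math


def zoning_features(image, zone_rows, zone_cols):
    """Split a binary image into a grid of zones and return the fraction of
    foreground pixels per zone, row by row. Assumes the image dimensions
    are divisible by the zone counts."""
    height, width = len(image), len(image[0])
    zh, zw = height // zone_rows, width // zone_cols
    features = []
    for zr in range(zone_rows):
        for zc in range(zone_cols):
            total = sum(
                image[r][c]
                for r in range(zr * zh, (zr + 1) * zh)
                for c in range(zc * zw, (zc + 1) * zw)
            )
            features.append(total / (zh * zw))
    return features


def knn_predict(train, query, k=1):
    """train is a list of (feature_vector, label) pairs. Returns the
    majority label among the k nearest neighbours (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy 4x4 shapes (made up for illustration, not Pashto glyphs).
vertical = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
horizontal = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
noisy_vertical = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]

train = [
    (zoning_features(vertical, 2, 2), "vertical"),
    (zoning_features(horizontal, 2, 2), "horizontal"),
]
prediction = knn_predict(train, zoning_features(noisy_vertical, 2, 2))
print(prediction)  # vertical

# Dataset size stated in the abstract: 102 samples for each of 44 letters.
print(102 * 44)  # 4488
```

Zone densities tolerate small stroke variations, such as the extra pixel in `noisy_vertical`, because each feature averages over a region rather than depending on exact pixel positions.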