Rabeya Tus Sadia

LG
h-index: 7
4 papers
8 citations
Novelty: 38%
AI Score: 36

4 Papers

IV · Jun 2, 2022
Examining the behaviour of state-of-the-art convolutional neural networks for brain tumor detection with and without transfer learning

Md. Atik Ahamed, Rabeya Tus Sadia

Distinguishing normal from malignant and determining the tumor type are critical components of brain tumor diagnosis. This work investigates two kinds of datasets using state-of-the-art CNN models. One dataset (binary) contains images of normal and tumor cases, while the other (multi-class) contains only tumor images, classified as glioma, meningioma, or pituitary. Experiments were conducted on both datasets with transfer learning from ImageNet pre-trained weights as well as with randomly initialized weights. The experimental environment is the same for all models in this study to ensure a fair comparison. For both datasets, the validation set is the same for all models: 60% of the data is used for training and the remaining 40% for validation. With the proposed techniques, the EfficientNet-B5 architecture outperforms all the other state-of-the-art models, reaching 99.75% accuracy on the binary-classification dataset and 98.61% on the multi-class dataset. This research also demonstrates how validation loss converges under the different weight initialization techniques.
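The abstract states that every model is evaluated on the same validation set with a 60/40 train/validation split. One way such a split could be kept identical across models is a fixed-seed shuffle, sketched below. This is an illustration only, not the authors' code: the function name `split_60_40`, the seed value, and the `image_ids` list are all assumptions made for the example.

```python
import random


def split_60_40(items, seed=0):
    """Shuffle deterministically, then return (train, validation) as a 60/40 split.

    Hypothetical helper: the seed and the shuffling strategy are assumptions,
    not details taken from the paper.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # same seed -> same order every time
    cut = len(shuffled) * 60 // 100        # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image identifiers standing in for a real dataset.
image_ids = [f"img_{i:03d}" for i in range(10)]

# Two calls with the same seed mimic two models being trained separately.
train_a, val_a = split_60_40(image_ids, seed=42)
train_b, val_b = split_60_40(image_ids, seed=42)
```

Because the shuffle depends only on the seed, each model trained in a separate run would see exactly the same validation images, which is what makes the accuracy comparison fair.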

CV · Feb 11, 2025
CausalGeD: Blending Causality and Diffusion for Spatial Gene Expression Generation

Rabeya Tus Sadia, Md Atik Ahamed, Qiang Cheng

The integration of single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics (ST) data is crucial for understanding gene expression in spatial context. Existing methods for such integration have limited performance, with structural similarity often below 60%. We attribute this limitation to the failure to consider causal relationships between genes. We present CausalGeD, which combines diffusion and autoregressive processes to leverage these relationships. By generalizing the Causal Attention Transformer from image generation to gene expression data, our model captures regulatory mechanisms without predefined relationships. Across 10 tissue datasets, CausalGeD outperformed state-of-the-art baselines by 5-32% in key metrics, including Pearson's correlation and structural similarity, advancing both technical and biological insights.

LG · Sep 12, 2025
CrunchLLM: Multitask LLMs for Structured Business Reasoning and Outcome Prediction

Rabeya Tus Sadia, Qiang Cheng

Predicting the success of start-up companies, defined as achieving an exit through acquisition or IPO, is a critical problem in entrepreneurship and innovation research. Datasets such as Crunchbase provide both structured information (e.g., funding rounds, industries, investor networks) and unstructured text (e.g., company descriptions), but effectively leveraging this heterogeneous data for prediction remains challenging. Traditional machine learning approaches often rely only on structured features and achieve moderate accuracy, while large language models (LLMs) offer rich reasoning abilities but struggle to adapt directly to domain-specific business data. We present CrunchLLM, a domain-adapted LLM framework for startup success prediction. CrunchLLM integrates structured company attributes with unstructured textual narratives and applies parameter-efficient fine-tuning strategies alongside prompt optimization to specialize foundation models for entrepreneurship data. Our approach achieves accuracy exceeding 80% on Crunchbase startup success prediction, significantly outperforming traditional classifiers and baseline LLMs. Beyond predictive performance, CrunchLLM provides interpretable reasoning traces that justify its predictions, enhancing transparency and trustworthiness for financial and policy decision makers. This work demonstrates how adapting LLMs with domain-aware fine-tuning and structured-unstructured data fusion can advance predictive modeling of entrepreneurial outcomes. CrunchLLM contributes a methodological framework and a practical tool for data-driven decision making in venture capital and innovation policy.
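The abstract describes combining structured company attributes with unstructured text before passing them to an LLM. A minimal sketch of one possible way to do that, serializing both into a single prompt string, is shown below. It is not the CrunchLLM implementation: the function `build_prompt`, the record layout, the field names, and the prompt wording are all hypothetical.

```python
def build_prompt(record):
    """Serialize structured attributes and free text into one prompt string.

    Hypothetical helper: the record layout and wording are assumptions made
    for illustration, not the format used in the paper.
    """
    # Sort keys so the same record always yields the same prompt.
    structured = "; ".join(
        f"{key}: {value}" for key, value in sorted(record["attributes"].items())
    )
    return (
        f"Structured attributes: {structured}\n"
        f"Description: {record['description']}\n"
        "Question: Will this company reach an exit (acquisition or IPO)? "
        "Answer yes or no."
    )


# Hypothetical company record; none of these values come from Crunchbase.
example = {
    "attributes": {"industry": "fintech", "funding_rounds": 3},
    "description": "A payments platform for small merchants.",
}
prompt = build_prompt(example)
```

Placing both kinds of information in one text input is only one fusion strategy; the abstract does not say which mechanism the authors use.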

LG · Jul 31, 2025
DepMicroDiff: Diffusion-Based Dependency-Aware Multimodal Imputation for Microbiome Data

Rabeya Tus Sadia, Qiang Cheng

Microbiome data analysis is essential for understanding host health and disease, yet its inherent sparsity and noise pose major challenges for accurate imputation, hindering downstream tasks such as biomarker discovery. Existing imputation methods, including recent diffusion-based models, often fail to capture the complex interdependencies between microbial taxa and overlook contextual metadata that can inform imputation. We introduce DepMicroDiff, a novel framework that combines diffusion-based generative modeling with a Dependency-Aware Transformer (DAT) to explicitly capture both mutual pairwise dependencies and autoregressive relationships. DepMicroDiff is further enhanced by VAE-based pretraining across diverse cancer datasets and conditioning on patient metadata encoded via a large language model (LLM). Experiments on TCGA microbiome datasets show that DepMicroDiff substantially outperforms state-of-the-art baselines, achieving higher Pearson correlation (up to 0.712), cosine similarity (up to 0.812), and lower RMSE and MAE across multiple cancer types, demonstrating its robustness and generalizability for microbiome imputation.
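The four metrics named in the abstract (Pearson correlation, cosine similarity, RMSE, and MAE) have standard definitions, sketched below in plain Python. The function names and the toy `truth` and `imputed` vectors are assumptions for illustration; they are not the paper's evaluation code or data, and the functions do not guard against constant or all-zero inputs.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def cosine_similarity(x, y):
    """Cosine of the angle between the two vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def rmse(x, y):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mae(x, y):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Toy vectors (hypothetical): the "imputed" values are exactly twice the truth.
truth = [1.0, 2.0, 3.0, 4.0]
imputed = [2.0, 4.0, 6.0, 8.0]

scores = {
    "pearson": pearson(truth, imputed),
    "cosine": cosine_similarity(truth, imputed),
    "rmse": rmse(truth, imputed),
    "mae": mae(truth, imputed),
}
```

In this toy case the correlation and cosine similarity are both essentially 1 even though every imputed value is off by a factor of two, while RMSE and MAE are clearly non-zero. That difference is one reason to report scale-insensitive and error-based metrics together.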