Danish Ali

CV
h-index: 32
8 papers
162 citations
Novelty: 40%
AI Score: 52

8 Papers

CL · May 29
Reliable Multilingual Orthopedic Decision Support from Clinical Narratives: Language-Aware Adaptation and Verification-Guided Deferral

Danish Ali, Li Xiaojian, Sundas Iqbal et al.

Multilingual orthopedic decision support remains challenging in low-resource healthcare settings, where clinical narratives contain specialized terminology, mixed scripts, incomplete evidence, label imbalance and language-dependent documentation patterns. This article presents a reliability-oriented framework for classifying free-text orthopedic notes in English, Hindi and Punjabi. We compare task-aligned multilingual transformer encoders, a task-fine-tuned DistilBERT baseline, zero-shot instruction-tuned large language models (LLMs) and a domain-adaptive encoder, IndicBERT-HPA. IndicBERT-HPA augments IndicBERT with language-aware orthopedic adapter heads to support clinically relevant multilingual representation learning. Evaluation extends beyond aggregate accuracy to per-class performance, ROC-AUC, AUPRC, expected calibration error, cross-language stability and robustness under controlled balanced and natural-prevalence distributions. The evaluated zero-shot LLMs remain substantially less effective than task-adapted encoders for closed-set classification, with language-dependent instability. Under natural clinical prevalence, IndicBERT-HPA achieves the strongest overall performance, reaching an averaged Macro-F1 of 0.8792, Macro-AUROC of 0.894 and AUPRC of 0.902. We further implement a deterministic selective-verification layer combining confidence gating, evidence-consistency checking and language-risk screening. On a randomly selected held-out 5,000-record subset, it achieves 84.4% selective accuracy and 0.76 selective Macro-F1 at 72.3% coverage, compared with 71.5% accuracy and 0.65 Macro-F1 for accept-all prediction. These results support reliability-oriented multilingual clinical decision support with explicit deferral.
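The coverage and selective-accuracy figures reported above can be made concrete with a small sketch. It covers only the confidence-gating part of such a layer; the evidence-consistency and language-risk checks are not modelled. The function name, the threshold value and the toy data are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence, Tuple


def selective_metrics(
    confidences: Sequence[float],
    predictions: Sequence[int],
    labels: Sequence[int],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, selective_accuracy) for a confidence-gated classifier.

    A prediction is accepted when its confidence is >= threshold;
    everything else is deferred (for example, to a human reviewer).
    """
    if not (len(confidences) == len(predictions) == len(labels)):
        raise ValueError("inputs must have the same length")
    if not labels:
        raise ValueError("inputs must be non-empty")

    accepted = [i for i, c in enumerate(confidences) if c >= threshold]
    coverage = len(accepted) / len(labels)
    if not accepted:
        # Nothing accepted: selective accuracy is undefined.
        return coverage, float("nan")

    correct = sum(1 for i in accepted if predictions[i] == labels[i])
    return coverage, correct / len(accepted)


# Hypothetical toy data (not from the paper).
conf = [0.95, 0.60, 0.85, 0.40, 0.90]
pred = [0, 1, 2, 1, 0]
gold = [0, 2, 2, 0, 0]

gated = selective_metrics(conf, pred, gold, threshold=0.8)
accept_all = selective_metrics(conf, pred, gold, threshold=0.0)
print("gated:", gated)
print("accept-all:", accept_all)
```

Raising the threshold trades coverage for accuracy on the accepted subset, which is the trade-off the abstract reports against accept-all prediction.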

CV · May 21
D3Seg: Dependency-Aware Diffusion for Brain Tumor Segmentation with Missing Modalities

Danish Ali, Ajmal Mian, Naveed Akhtar et al.

Accurate brain tumor segmentation using multiparametric MRI is critical for effective treatment planning. However, in clinical settings, complete acquisition of all MRI sequences is not always possible. The absence of certain MRI modalities results in substantial performance degradation in existing segmentation methods, which typically rely on naive feature concatenation or direct fusion strategies. To address this limitation, we propose D3Seg, a novel segmentation model designed to maintain stable performance under missing-modality settings. D3Seg introduces Multi-hop Modality Graph Fusion (MMGF) to model higher-order inter-modality dependencies, a lightweight diffusion-based imputation mechanism to compensate for missing T1ce representations in latent space, and probability-space decision refinement to mitigate dominant-class overconfidence and improve delineation of underrepresented tumor subregions. Extensive evaluation on the BraTS 2023 dataset demonstrates that D3Seg consistently improves segmentation performance under missing-modality configurations. The proposed model achieves approximately 1.5-2.0% Dice improvement on enhancing tumor (ET) and around 1.0% on tumor core (TC) across multiple missing-modality configurations compared to the current state-of-the-art model, while maintaining computational efficiency.
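For readers unfamiliar with the Dice metric quoted above, here is a minimal sketch for binary masks. It assumes flattened 0/1 masks and treats two empty masks as a perfect match; the function name and that convention are illustrative choices, and the paper's evaluation code may differ.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")

    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Assumed convention: both masks empty counts as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 5-voxel example: 2 overlapping voxels, 3 foreground voxels each.
example = dice_score([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(round(example, 4))
```

In a BraTS-style evaluation this would be computed separately for each region (for example ET and TC) and averaged over cases.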

CL · May 4
Reliability-Oriented Multilingual Orthopedic Diagnosis: A Domain-Adaptive Modeling and a Conceptual Validation Framework

Danish Ali, Li Xiaojian, Sundas Iqbal et al.

Large Language Models (LLMs) are increasingly proposed for clinical decision support, including multilingual diagnosis in low-resource settings. However, their reliability, calibration and safety characteristics remain insufficiently understood for structured, high-risk tasks. We present a system-level analysis of multilingual orthopedic diagnosis from free-text clinical notes in English, Hindi and Punjabi. We evaluate three modeling regimes: (i) task-aligned multilingual transformer encoders, (ii) a task-fine-tuned baseline (DistilBERT), and (iii) a domain-adaptive architecture tailored to orthopedic text (IndicBERT-HPA). These models are compared with zero-shot, instruction-tuned LLMs to assess suitability for structured diagnostic classification. Results indicate that while LLMs exhibit strong linguistic fluency, they show unstable calibration and reduced reliability under structured multilingual conditions, particularly in low-resource languages. These findings are specific to zero-shot evaluation and do not imply limitations of fine-tuned models. Domain-adaptive specialization substantially improves cross-lingual discrimination and confidence behavior. IndicBERT-HPA, with language-specific orthopedic adapter heads, achieves consistently strong performance across six diagnostic categories and more predictable deployment characteristics than task-only adaptation. Building on these observations, we outline a conceptual deterministic agent-based validation framework for future implementation, formalizing evidence checks, language-sensitive validation and conservative human-in-the-loop gating. Reliable multilingual clinical decision support requires specialized architecture, explicit reliability analysis, and structured validation for safety-critical systems.
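One common way to quantify the calibration behavior discussed above is expected calibration error (ECE). The abstract does not state how calibration was measured, so the sketch below is an assumption: it uses equal-width confidence bins, and the bin count and function name are illustrative.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,
) -> float:
    """ECE with equal-width bins: sum over bins of (n_b / N) * |acc_b - conf_b|."""
    if len(confidences) != len(correct):
        raise ValueError("inputs must have the same length")
    if not confidences:
        raise ValueError("inputs must be non-empty")

    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        bins[idx].append((c, ok))

    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# Hypothetical example: the model claims 90% confidence but is right 3 times out of 4.
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
print(round(overconfident, 4))
```

A lower value means the stated confidence tracks the observed accuracy more closely, which matters when confidence is later used to decide which cases to defer.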

CV · Apr 30
AIDA-ReID: Adaptive Intermediate Domain Adaptation for Generalizable and Source-Free Person Re-Identification

Sundas Iqbal, Qing Tian, Danish Ali et al.

Person re-identification (Re-ID) aims to match images of the same individual across non-overlapping camera views and remains challenging due to domain shifts caused by variations in illumination, background, camera characteristics, and population distributions. Although supervised models perform well under matched training and testing conditions, their performance degrades significantly when deployed in unseen environments. Existing intermediate domain approaches such as IDM and IDM++ alleviate this gap by constructing bridge feature distributions between domains; however, they rely on fixed mixing strategies and joint source-target access, limiting their applicability to multi-source and source-free settings. To address these limitations, this paper proposes Adaptive Intermediate Domain Adaptation (AIDA), also referred to as Source-Free Multi-Source Intermediate Domain Adaptation (SF-MIDA). The proposed framework treats intermediate-domain learning as a dynamically regulated process, where feature mixing and regularization strength are adaptively controlled using feedback signals derived from model uncertainty and training stability. A multi-source intermediate domain generator synthesizes diverse intermediate representations, while a pseudo-mirror regularization strategy preserves identity consistency under domain perturbations. Extensive experiments across domain generalization and source-free settings demonstrate the effectiveness of the proposed framework.

CV · Oct 16, 2025
DRBD-Mamba for Robust and Efficient Brain Tumor Segmentation with Analytical Insights

Danish Ali, Ajmal Mian, Naveed Akhtar et al.

Accurate brain tumor segmentation is important for clinical diagnosis and treatment but remains challenging due to tumor heterogeneity. Mamba-based State Space Models have demonstrated promising performance. However, despite their computational efficiency over other neural architectures, they incur considerable overhead for this task due to their sequential feature computation across multiple spatial axes. Moreover, their robustness across diverse BraTS data partitions remains largely unexplored, leaving a critical gap in reliable evaluation. To address this, we first propose a dual-resolution bi-directional Mamba (DRBD-Mamba), an efficient 3D segmentation model that captures multi-scale long-range dependencies with minimal computational overhead. We leverage a space-filling curve to preserve spatial locality during 3D-to-1D feature mapping, thereby reducing reliance on computationally expensive multi-axial feature scans. To enrich feature representation, we propose a gated fusion module that adaptively integrates forward and reverse contexts, along with a quantization block that improves robustness. We further propose five systematic folds on BraTS2023 for rigorous evaluation of segmentation techniques under diverse conditions and present an analysis of common failure scenarios. On the 20% test set used by recent methods, our model achieves Dice improvements of 0.10% for whole tumor, 1.75% for tumor core, and 0.93% for enhancing tumor. Evaluations on the proposed systematic folds demonstrate that our model maintains competitive whole tumor accuracy while achieving clear average Dice gains of 1.16% for tumor core and 1.68% for enhancing tumor over the existing state of the art. Furthermore, our model achieves a 15x efficiency improvement while maintaining high segmentation accuracy, highlighting its robustness and computational advantage over existing methods.
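The idea of a locality-preserving 3D-to-1D mapping can be illustrated with a Z-order (Morton) curve. The abstract does not name the space-filling curve used in DRBD-Mamba, so Z-order is chosen here purely as an assumed example, and the function name and grid size are hypothetical.

```python
from itertools import product
from typing import List, Tuple


def morton3d(x: int, y: int, z: int, bits: int = 4) -> int:
    """Interleave the low `bits` bits of x, y and z into one Z-order index."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def zorder_scan(size: int, bits: int = 4) -> List[Tuple[int, int, int]]:
    """Return all voxel coordinates of a size^3 grid in Z-order."""
    coords = product(range(size), repeat=3)
    return sorted(coords, key=lambda c: morton3d(c[0], c[1], c[2], bits))


# In a 4x4x4 grid, the first 8 visited voxels form one compact 2x2x2 block,
# unlike a row-major scan, whose first 8 voxels span two full rows.
scan = zorder_scan(4)
print(scan[:8])
```

A single scan of this kind keeps spatially neighboring voxels close together in the sequence, which is the property that reduces the need for separate scans along each axis.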

HC · Apr 27, 2020
Influence of Hand Tracking as a way of Interaction in Virtual Reality on User Experience

Jan-Niklas Voigt-Antons, Tanja Kojić, Danish Ali et al.

With the rising interest in Virtual Reality and the fast development and improvement of available devices, new forms of interaction are becoming available. One that is becoming very popular is hand tracking, which aims to replace controllers for interactions in virtual worlds. This experiment aims to compare different interaction types in VR using either controllers or hand tracking. Participants had to play two simple VR games with different types of tasks: grabbing objects or typing numbers. While playing, they used interactions with different visualizations of hands and controllers. The focus of this study was to investigate the user experience of the different interactions (controller vs. hand tracking) for those two simple tasks. Results show that interaction type has a statistically significant influence on emotions reported with the Self-Assessment Manikin (SAM): with hand tracking, participants reported higher valence but lower arousal and dominance. Additionally, the grabbing task was reported to be more realistic, and participants experienced a higher presence. Surprisingly, participants rated the controller interaction in which both hands and controllers were visualized as statistically most preferred. Finally, hand tracking for both tasks was rated with the System Usability Scale (SUS), and hand tracking for the typing task was rated as statistically significantly more usable. These results can drive further research and, in the long term, help in selecting the best-matching interaction modality for a task.

MM · Apr 3, 2020
User Experience of Reading in Virtual Reality -- Finding Values for Text Distance, Size and Contrast

Tanja Kojić, Danish Ali, Robert Greinacher et al.

Virtual Reality (VR) has an increasing impact on the market in many fields, from education and medicine to engineering and entertainment, by creating different applications that replicate or, in the case of augmentation, enhance real-life scenarios. To present realistic environments, VR applications include the kind of text that surrounds us every day. However, text can only add value to the virtual environment if it is designed and created in such a way that users can comfortably read it. To explore which values for text parameters users find comfortable while reading in virtual reality, a study was conducted that allowed participants to manipulate text parameters such as font size, distance, and contrast. For this purpose, two different standalone virtual reality devices were used, Oculus Go and Quest, together with three different text samples: short (2 words), medium (21 words), and long (51 words). Participants had the task of setting text parameters to the best and worst possible value. Additionally, participants were asked to rate their experience of reading in virtual reality. Results report mean values for angular size (the combination of distance and font size) and color contrast depending on the device used as well as the text length, for both tasks. Significant differences were found for values of angular size, depending on the length of the displayed text. Device type had no significant influence on text parameters, but it did influence the experiences reported using the self-assessment manikin (SAM) scale.
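Angular size combines text height and viewing distance into a single value. The sketch below uses the standard visual-angle formula, theta = 2 * atan(h / (2 * d)); the abstract does not give the exact formula or units used in the study, so both the formula choice and the function name are assumptions.

```python
import math


def angular_size_deg(text_height: float, distance: float) -> float:
    """Visual angle in degrees subtended by text of a given height.

    text_height and distance must be in the same unit (for example metres).
    """
    if text_height <= 0 or distance <= 0:
        raise ValueError("text_height and distance must be positive")
    return math.degrees(2.0 * math.atan(text_height / (2.0 * distance)))


# Hypothetical values: doubling both height and distance leaves the angle unchanged.
near = angular_size_deg(0.1, 2.0)
far = angular_size_deg(0.2, 4.0)
print(round(near, 3), round(far, 3))
```

Because only the ratio of height to distance matters, reporting angular size lets results from different combinations of font size and distance be compared directly.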

MM · Apr 3, 2020
Comparing emotional states induced by 360$^{\circ}$ videos via head-mounted display and computer screen

Jan-Niklas Voigt-Antons, Eero Lehtonen, Andres Pinilla Palacios et al.

In recent years, 360$^{\circ}$ videos have become more popular. For traditional media presentations, e.g., on a computer screen, a wide range of assessment methods are available. Different constructs, such as perceived quality or the induced emotional state of viewers, can be reliably assessed by subjective scales. Many of the subjective methods have only been validated using stimuli presented on a computer screen. This paper uses 360$^{\circ}$ videos to induce varying emotional states. Videos were presented 1) via a head-mounted display (HMD) and 2) via a traditional computer screen. Furthermore, participants were asked to rate their emotional state 1) in retrospect on the self-assessment manikin scale and 2) continuously on a 2-dimensional arousal-valence plane. In a repeated measures design, all participants (N = 18) used both presentation systems and both rating systems. Results indicate that there is a statistically significant difference in induced presence due to the presentation system. Furthermore, there was no statistically significant difference in ratings gathered with the two presentation systems. Finally, for arousal measures, a statistically significant difference was found between the rating methods, potentially indicating an underestimation of arousal ratings gathered in retrospect for screen presentation. In the future, rating methods such as a 2-dimensional arousal-valence plane could offer the advantage of enabling a reliable measurement of emotional states while being more embedded in the experience itself, enabling a more precise capturing of the emotional states.