CVAug 24, 2024Code
Learning from the few: Fine-grained approach to pediatric wrist pathology recognition on a limited datasetAmmar Ahmed, Ali Shariq Imran, Zenun Kastrati et al.
Wrist pathologies, {particularly fractures common among children and adolescents}, present a critical diagnostic challenge. While X-ray imaging remains a prevalent diagnostic tool, the increasing misinterpretation rates highlight the need for more accurate analysis, especially considering the lack of specialized training among many surgeons and physicians. Recent advancements in deep convolutional neural networks offer promise in automating pathology detection in trauma X-rays. However, distinguishing subtle variations between {pediatric} wrist pathologies in X-rays remains challenging. Traditional manual annotation, though effective, is laborious, costly, and requires specialized expertise. {In this paper, we address the challenge of pediatric wrist pathology recognition with a fine-grained approach, aimed at automatically identifying discriminative regions in X-rays without manual intervention. We refine our fine-grained architecture through ablation analysis and the integration of LION.} Leveraging Grad-CAM, an explainable AI technique, we highlight these regions. Despite using limited data, reflective of real-world medical study constraints, our method consistently outperforms state-of-the-art image recognition models on both augmented and original (challenging) test sets. {Our proposed refined architecture achieves an increase in accuracy of 1.06% and 1.25% compared to the baseline method, resulting in accuracies of 86% and 84%, respectively. Moreover, our approach demonstrates the highest fracture sensitivity of 97%, highlighting its potential to enhance wrist pathology recognition. The implementation code can be found at https://github.com/ammarlodhi255/fine-grained-approach-to-wrist-pathology-recognition
CVJul 17, 2024
Enhancing Wrist Fracture Detection with YOLOAmmar Ahmed, Ali Shariq Imran, Abdul Manaf et al.
Diagnosing and treating abnormalities in the wrist, specifically distal radius, and ulna fractures, is a crucial concern among children, adolescents, and young adults, with a higher incidence rate during puberty. However, the scarcity of radiologists and the lack of specialized training among medical professionals pose a significant risk to patient care. This problem is further exacerbated by the rising number of imaging studies and limited access to specialist reporting in certain regions. This highlights the need for innovative solutions to improve the diagnosis and treatment of wrist abnormalities. Automated wrist fracture detection using object detection has shown potential, but current studies mainly use two-stage detection methods with limited evidence for single-stage effectiveness. This study employs state-of-the-art single-stage deep neural network-based detection models YOLOv5, YOLOv6, YOLOv7, and YOLOv8 to detect wrist abnormalities. Through extensive experimentation, we found that these YOLO models outperform the commonly used two-stage detection algorithm, Faster R-CNN, in fracture detection. Additionally, compound-scaled variants of each YOLO model were compared, with YOLOv8m demonstrating a highest fracture detection sensitivity of 0.92 and mean average precision (mAP) of 0.95. On the other hand, YOLOv6m achieved the highest sensitivity across all classes at 0.83. Meanwhile, YOLOv8x recorded the highest mAP of 0.77 for all classes on the GRAZPEDWRI-DX pediatric wrist dataset, highlighting the potential of single-stage models for enhancing pediatric wrist imaging.
CVSep 7, 2024
Metadata augmented deep neural networks for wild animal classificationAslak Tøn, Ammar Ahmed, Ali Shariq Imran et al.
Camera trap imagery has become an invaluable asset in contemporary wildlife surveillance, enabling researchers to observe and investigate the behaviors of wild animals. While existing methods rely solely on image data for classification, this may not suffice in cases of suboptimal animal angles, lighting, or image quality. This study introduces a novel approach that enhances wild animal classification by combining specific metadata (temperature, location, time, etc) with image data. Using a dataset focused on the Norwegian climate, our models show an accuracy increase from 98.4% to 98.9% compared to existing methods. Notably, our approach also achieves high accuracy with metadata-only classification, highlighting its potential to reduce reliance on image quality. This work paves the way for integrated systems that advance wildlife classification technology.
HCJul 6, 2025
The role of large language models in UI/UX design: A systematic literature reviewAmmar Ahmed, Ali Shariq Imran
This systematic literature review examines the role of large language models (LLMs) in UI/UX design, synthesizing findings from 38 peer-reviewed studies published between 2022 and 2025. We identify key LLMs in use, including GPT-4, Gemini, and PaLM, and map their integration across the design lifecycle, from ideation to evaluation. Common practices include prompt engineering, human-in-the-loop workflows, and multimodal input. While LLMs are reshaping design processes, challenges such as hallucination, prompt instability, and limited explainability persist. Our findings highlight LLMs as emerging collaborators in design, and we propose directions for the ethical, inclusive, and effective integration of these technologies.
CVDec 18, 2024
Navigating limitations with precision: A fine-grained ensemble approach to wrist pathology recognition on a limited x-ray datasetAmmar Ahmed, Ali Shariq Imran, Mohib Ullah et al.
The exploration of automated wrist fracture recognition has gained considerable research attention in recent years. In practical medical scenarios, physicians and surgeons may lack the specialized expertise required for accurate X-ray interpretation, highlighting the need for machine vision to enhance diagnostic accuracy. However, conventional recognition techniques face challenges in discerning subtle differences in X-rays when classifying wrist pathologies, as many of these pathologies, such as fractures, can be small and hard to distinguish. This study tackles wrist pathology recognition as a fine-grained visual recognition (FGVR) problem, utilizing a limited, custom-curated dataset that mirrors real-world medical constraints, relying solely on image-level annotations. We introduce a specialized FGVR-based ensemble approach to identify discriminative regions within X-rays. We employ an Explainable AI (XAI) technique called Grad-CAM to pinpoint these regions. Our ensemble approach outperformed many conventional SOTA and FGVR techniques, underscoring the effectiveness of our strategy in enhancing accuracy in wrist pathology recognition.
CVJul 17, 2025
Demographic-aware fine-grained classification of pediatric wrist fracturesAmmar Ahmed, Ali Shariq Imran, Zenun Kastrati et al.
Wrist pathologies are frequently observed, particularly among children who constitute the majority of fracture cases. Computer vision presents a promising avenue, contingent upon the availability of extensive datasets, a notable challenge in medical imaging. Therefore, reliance solely on one modality, such as images, proves inadequate, especially in an era of diverse and plentiful data types. This study addresses the problem using a multifaceted approach: framing it as a fine-grained recognition task, fusing patient metadata with X-rays, and leveraging weights from a separate fine-grained dataset rather than from a coarse-grained dataset like ImageNet. Unlike prior work, this is the first application of metadata integration for wrist pathology recognition. Our results show that combining fine-grained transformer approach, fine-grained pre-training, and metadata integration improves diagnostic accuracy by 2% on small custom curated dataset and over 10% on a larger fracture dataset.
SIAug 23, 2020
Cross-Cultural Polarity and Emotion Detection Using Sentiment Analysis and Deep Learning -- a Case Study on COVID-19Ali Shariq Imran, Sher Mohammad Doudpota, Zenun Kastrati et al.
How different cultures react and respond given a crisis is predominant in a society's norms and political will to combat the situation. Often the decisions made are necessitated by events, social pressure, or the need of the hour, which may not represent the will of the nation. While some are pleased with it, others might show resentment. Coronavirus (COVID-19) brought a mix of similar emotions from the nations towards the decisions taken by their respective governments. Social media was bombarded with posts containing both positive and negative sentiments on the COVID-19, pandemic, lockdown, hashtags past couple of months. Despite geographically close, many neighboring countries reacted differently to one another. For instance, Denmark and Sweden, which share many similarities, stood poles apart on the decision taken by their respective governments. Yet, their nation's support was mostly unanimous, unlike the South Asian neighboring countries where people showed a lot of anxiety and resentment. This study tends to detect and analyze sentiment polarity and emotions demonstrated during the initial phase of the pandemic and the lockdown period employing natural language processing (NLP) and deep learning techniques on Twitter posts. Deep long short-term memory (LSTM) models used for estimating the sentiment polarity and emotions from extracted tweets have been trained to achieve state-of-the-art accuracy on the sentiment140 dataset. The use of emoticons showed a unique and novel way of validating the supervised deep learning models on tweets extracted from Twitter.
CVJun 11, 2020
Kalman Filter Based Multiple Person Head TrackingMohib Ullah, Maqsood Mahmud, Habib Ullah et al.
For multi-target tracking, target representation plays a crucial rule in performance. State-of-the-art approaches rely on the deep learning-based visual representation that gives an optimal performance at the cost of high computational complexity. In this paper, we come up with a simple yet effective target representation for human tracking. Our inspiration comes from the fact that the human body goes through severe deformation and inter/intra occlusion over the passage of time. So, instead of tracking the whole body part, a relative rigid organ tracking is selected for tracking the human over an extended period of time. Hence, we followed the tracking-by-detection paradigm and generated the target hypothesis of only the spatial locations of heads in every frame. After the localization of head location, a Kalman filter with a constant velocity motion model is instantiated for each target that follows the temporal evolution of the targets in the scene. For associating the targets in the consecutive frames, combinatorial optimization is used that associates the corresponding targets in a greedy fashion. Qualitative results are evaluated on four challenging video surveillance dataset and promising results has been achieved.