CV Feb 13, 2023
An Application of Deep Learning for Sweet Cherry Phenotyping using YOLO Object Detection
Ritayu Nagpal, Sam Long, Shahid Jahagirdar et al.
Tree fruit breeding is a long-term activity involving repeated measurements of various fruit quality traits on a large number of samples. These traits are traditionally measured by manually counting the fruits, weighing them as an indirect measure of fruit size, and subjectively classifying fruit colour into categories by visual comparison with colour charts. These processes are slow, expensive and subject to evaluators' bias and fatigue. Recent advancements in deep learning can help automate this process. A method was developed to automatically count the number of sweet cherry fruits in a camera's field of view in real time using YOLOv3. A system capable of analyzing the image data for other traits such as size and colour was also developed using Python. The YOLO model obtained close to 99% accuracy in object detection and counting of cherries and 90% on the Intersection over Union metric for object localization when extracting size and colour information. The model surpasses human performance and offers a significant improvement compared to manual counting.
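The Intersection over Union (IoU) metric mentioned in the abstract above can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` tuples; the function name `iou` and the box format are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption
    made for the sketch, not the paper's data layout.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

For example, a predicted box that covers half of an equally sized ground-truth box and extends past it by the same amount scores 1/3, because the overlap is one third of the combined area.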
AI May 18
Learning Quantifiable Visual Explanations Without Ground-Truth
Amritpal Singh, Andrey Barsky, Mohamed Ali Souibgui et al.
Explainable AI (XAI) techniques are increasingly important for the validation and responsible use of modern deep learning models, but are difficult to evaluate due to the lack of good ground-truth to compare against. We propose a framework that serves as a quantifiable metric for the quality of XAI methods, based on continuous input perturbation. Our metric formally considers the sufficiency and necessity of the attributed information to the model's decision-making, and we illustrate a range of cases where it aligns better with human intuitions of explanation quality than do existing metrics. To exploit the properties of this metric, we also propose a novel XAI method, considering the case where we fine-tune a model using a differentiable approximation of the metric as a supervision signal. The result is an adapter module that can be trained on top of any black-box model to output causal explanations of the model's decision process, without degrading model performance. We show that the explanations generated by this method outperform those of competing XAI techniques according to a number of quantifiable metrics.
LG Nov 7, 2023
Class-Incremental Continual Learning for General Purpose Healthcare Models
Amritpal Singh, Mustafa Burak Gurbuz, Shiva Souhith Gantha et al.
Healthcare clinics regularly encounter dynamic data that changes due to variations in patient populations, treatment policies, medical devices, and emerging disease patterns. Deep learning models can suffer from catastrophic forgetting when fine-tuned in such scenarios, causing poor performance on previously learned tasks. Continual learning allows a model to learn new tasks without a drop in performance on previous tasks. In this work, we investigate the performance of continual learning models on four different medical imaging scenarios involving ten classification datasets from diverse modalities, clinical specialties, and hospitals. We implement various continual learning approaches and evaluate their performance in these scenarios. Our results demonstrate that a single model can sequentially learn new tasks from different specialties and achieve performance comparable to naive methods. These findings indicate the feasibility of recycling or sharing models across the same or different medical specialties, offering another step towards the development of general-purpose medical imaging AI that can be shared across institutions.
LG Jul 15, 2024
GraphPrint: Extracting Features from 3D Protein Structure for Drug Target Affinity Prediction
Amritpal Singh
Accurate drug target affinity prediction can improve drug candidate selection, accelerate the drug discovery process, and reduce drug production costs. Previous work focused on traditional fingerprints or used features extracted from the protein's amino acid sequence, ignoring its 3D structure, which affects its binding affinity. In this work, we propose GraphPrint: a framework for incorporating 3D protein structure features for drug target affinity prediction. We generate graph representations for protein 3D structures using amino acid residue location coordinates and combine them with drug graph representation and traditional features to jointly learn drug target affinity. Our model achieves a mean squared error of 0.1378 and a concordance index of 0.8929 on the KIBA dataset and improves over using traditional protein features alone. Our ablation study shows that the 3D protein structure-based features provide information complementary to traditional features.
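The two evaluation metrics reported above, mean squared error and concordance index, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, the pairwise loop is the textbook O(n²) definition (ties in the prediction count as half-concordant), and nothing here reproduces the paper's evaluation code.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    A pair is comparable when the true values differ. Tied predictions
    contribute 0.5, a common convention assumed here for the sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity carry no ranking information
            comparable += 1
            diff_true = y_true[i] - y_true[j]
            diff_pred = y_pred[i] - y_pred[j]
            if diff_pred == 0:
                concordant += 0.5
            elif (diff_true > 0) == (diff_pred > 0):
                concordant += 1.0
    return concordant / comparable if comparable else 0.0
```

A concordance index of 1.0 means every pair is ranked correctly, and 0.5 is what random ordering would give on average, which is why the two metrics are usually reported together: one measures error in value, the other error in ranking.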
RO May 26, 2022
Roadmap to Autonomous Surgery -- A Framework to Surgical Autonomy
Amritpal Singh
Robotic surgery has expanded the range of surgeries that are possible. Several examples of partial surgical automation have been seen in the past decade. We break down the path to task automation into the features required and provide a checklist that can help reach higher levels of surgical automation. Finally, we discuss the current challenges and the advances required to make this happen.
LG Sep 2, 2023
Autonomous Soft Tissue Retraction Using Demonstration-Guided Reinforcement Learning
Amritpal Singh, Wenqi Shi, May D Wang
In the context of surgery, robots can provide substantial assistance by performing small, repetitive tasks such as suturing, needle exchange, and tissue retraction, thereby enabling surgeons to concentrate on more complex aspects of the procedure. However, existing surgical task learning mainly pertains to rigid body interactions, whereas the advancement towards more sophisticated surgical robots necessitates the manipulation of soft bodies. Previous work focused on tissue phantoms for soft tissue task learning, which can be expensive and can be an entry barrier to research. Simulation environments present a safe and efficient way to learn surgical tasks before their application to actual tissue. In this study, we create a Robot Operating System (ROS)-compatible physics simulation environment with support for both rigid and soft body interactions within surgical tasks. Furthermore, we investigate the soft tissue interactions facilitated by the patient-side manipulator of the da Vinci surgical robot. Leveraging the PyBullet physics engine, we simulate kinematics and establish anchor points to guide the robotic arm when manipulating soft tissue. We compare the performance of demonstration-guided reinforcement learning (RL) algorithms with that of traditional reinforcement learning algorithms. Our in silico trials demonstrate a proof-of-concept for autonomous surgical soft tissue retraction. The results corroborate the feasibility of learning soft body manipulation through the application of reinforcement learning agents. This work lays the foundation for future research into the development and refinement of surgical robots capable of managing both rigid and soft tissue interactions. Code is available at https://github.com/amritpal-001/tissue_retract.
CV Sep 24, 2024
VisioPhysioENet: Visual Physiological Engagement Detection Network
Alakhsimar Singh, Kanav Goyal, Nischay Verma et al.
This paper presents VisioPhysioENet, a novel multimodal system that leverages visual and physiological signals to detect learner engagement. It employs a two-level approach for extracting both visual and physiological features. For visual feature extraction, Dlib is used to detect facial landmarks, while OpenCV provides additional estimations. The face recognition library, built on Dlib, is used to identify the facial region of interest specifically for physiological signal extraction. Physiological signals are then extracted using the plane-orthogonal-to-skin method to assess cardiovascular activity. These features are integrated using advanced machine learning classifiers, enhancing the detection of various levels of engagement. We thoroughly tested VisioPhysioENet on the DAiSEE dataset, where it achieved an accuracy of 63.09%. This shows that it can identify different levels of engagement better than many existing methods. It performed 8.6% better than the only other model that uses both physiological and visual features.
LG Sep 15, 2025
Early Prediction of Multi-Label Care Escalation Triggers in the Intensive Care Unit Using Electronic Health Records
Syed Ahmad Chan Bukhari, Amritpal Singh, Shifath Hossain et al.
Intensive Care Unit (ICU) patients often present with complex, overlapping signs of physiological deterioration that require timely escalation of care. Traditional early warning systems, such as SOFA or MEWS, are limited by their focus on single outcomes and fail to capture the multi-dimensional nature of clinical decline. This study proposes a multi-label classification framework to predict Care Escalation Triggers (CETs), including respiratory failure, hemodynamic instability, renal compromise, and neurological deterioration, using the first 24 hours of ICU data. Using the MIMIC-IV database, CETs are defined through rule-based criteria applied to data from hours 24 to 72 (for example, oxygen saturation below 90, mean arterial pressure below 65 mmHg, creatinine increase greater than 0.3 mg/dL, or a drop in Glasgow Coma Scale score greater than 2). Features are extracted from the first 24 hours and include vital sign aggregates, laboratory values, and static demographics. We train and evaluate multiple classification models on a cohort of 85,242 ICU stays (80 percent training: 68,193; 20 percent testing: 17,049). Evaluation metrics include per-label precision, recall, F1-score, and Hamming loss. XGBoost, the best performing model, achieves F1-scores of 0.66 for respiratory, 0.72 for hemodynamic, 0.76 for renal, and 0.62 for neurologic deterioration, outperforming baseline models. Feature analysis shows that clinically relevant parameters such as respiratory rate, blood pressure, and creatinine are the most influential predictors, consistent with the clinical definitions of the CETs. The proposed framework demonstrates practical potential for early, interpretable clinical alerts without requiring complex time-series modeling or natural language processing.
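The rule-based labeling and the Hamming loss metric described above can be sketched as follows. The thresholds are the example values quoted in the abstract; the dictionary keys (`min_spo2`, `min_map`, `creatinine_rise`, `gcs_drop`) and the idea of summarizing the hour 24 to 72 window into one value per signal are assumptions made for illustration, not the study's actual feature definitions.

```python
# Label order used throughout this sketch (matches the four CETs in the abstract).
CET_LABELS = ("respiratory", "hemodynamic", "renal", "neurologic")


def cet_labels(window):
    """Turn a summary of the hour 24-72 window into four binary CET labels.

    `window` is a dict with hypothetical keys: minimum oxygen saturation,
    minimum mean arterial pressure (mmHg), creatinine rise (mg/dL), and
    drop in Glasgow Coma Scale score.
    """
    return [
        int(window["min_spo2"] < 90),          # respiratory failure
        int(window["min_map"] < 65),           # hemodynamic instability
        int(window["creatinine_rise"] > 0.3),  # renal compromise
        int(window["gcs_drop"] > 2),           # neurological deterioration
    ]


def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    )
    return wrong / total
```

Because each stay carries four independent labels, Hamming loss counts mistakes per label rather than per stay, so a prediction that gets three of four triggers right is penalized less than one that misses all four.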
LG Sep 14, 2025
A Machine Learning Framework for Pathway-Driven Therapeutic Target Discovery in Metabolic Disorders
Iram Wajahat, Amritpal Singh, Fazel Keshtkar et al.
Metabolic disorders, particularly type 2 diabetes mellitus (T2DM), represent a significant global health burden, disproportionately impacting genetically predisposed populations such as the Pima Indians (a Native American tribe from south central Arizona). This study introduces a novel machine learning (ML) framework that integrates predictive modeling with gene-agnostic pathway mapping to identify high-risk individuals and uncover potential therapeutic targets. Using the Pima Indian dataset, logistic regression and t-tests were applied to identify key predictors of T2DM, yielding an overall model accuracy of 78.43%. To bridge predictive analytics with biological relevance, we developed a pathway mapping strategy that links identified predictors to critical signaling networks, including insulin signaling, AMPK, and PPAR pathways. This approach provides mechanistic insights without requiring direct molecular data. Building upon these connections, we propose therapeutic strategies such as dual GLP-1/GIP receptor agonists, AMPK activators, SIRT1 modulators, and phytochemicals, further validated through pathway enrichment analyses. Overall, this framework advances precision medicine by offering interpretable and scalable solutions for early detection and targeted intervention in metabolic disorders. The key contributions of this work are: (1) development of an ML framework combining logistic regression and principal component analysis (PCA) for T2DM risk prediction; (2) introduction of a gene-agnostic pathway mapping approach to generate mechanistic insights; and (3) identification of novel therapeutic strategies tailored for high-risk populations.
CL Nov 13, 2014
A Text to Speech (TTS) System with English to Punjabi Conversion
Prabhsimran Singh, Amritpal Singh
This paper shows how to develop an application that converts English text into Punjabi and also converts the text to speech (TTS), i.e. pronounces it. Such an application can be especially beneficial for people with special needs.
CR Jan 14, 2012
Analysis of Different Privacy Preserving Cloud Storage Frameworks
Rajeev Bedi, Mohit Marwaha, Tajinder Singh et al.
The privacy and security of data in cloud storage is one of the main issues. Many frameworks and technologies are used to preserve data security in cloud storage. [1] proposes a framework that covers the design of the data organization structure, the generation and management of keys, the handling of changes to users' access rights and of dynamic data operations, and the interaction between participants. It also designs an interactive protocol and an extirpation-based key derivation algorithm, combined with lazy revocation, and uses a multi-tree structure and symmetric encryption to form a privacy-preserving, efficient framework for cloud storage. [2] proposes a privacy-preserving cloud storage framework that defines an interaction protocol among participants, uses a key derivation algorithm to generate and manage keys, uses both symmetric and asymmetric encryption to hide users' sensitive data, and applies a Bloom filter for ciphertext retrieval. A system based on this framework has been implemented. This paper analyzes both frameworks in terms of their feasibility, the running overhead of the system, and their privacy security.
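The Bloom filter that the second framework applies to ciphertext retrieval can be illustrated with a generic sketch. This is only the basic data structure, assuming string keywords hashed with salted SHA-256; the class name, bit count, and hash count are hypothetical, and it does not reproduce how [2] builds or queries its filter over encrypted data.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, tunable false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=4):
        # Sizes are illustrative defaults, not values from the cited framework.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        """Derive num_hashes bit positions by salting SHA-256 with a seed byte."""
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        """Set every bit position derived from the item."""
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        """True if all derived bits are set (may be a false positive)."""
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )
```

A server holding such a filter per stored file can answer "might this file contain the keyword?" without holding the keyword list itself, at the cost of occasional false positives that the client resolves after decryption.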