Alireza Rafiei

LG
h-index44
8papers
35citations
Novelty37%
AI Score42

8 Papers

96.9AIJun 3
Agents' Last Exam

Yiyou Sun, Xinyang Han, Weichen Zhang et al.

Recent AI systems have achieved strong results on a wide range of benchmarks, yet these gains have not translated into economically meaningful deployment across many professional domains. We argue that this gap is largely an evaluation problem: widely used benchmarks lack sustained performance measurement on real and economically valuable workflows. This paper introduces Agents' Last Exam (ALE), a benchmark designed to evaluate AI agents on long-horizon, economically valuable, real-world tasks with verifiable outcomes. Developed in collaboration with 250+ industry experts, ALE covers non-physical industries defined with reference to O*NET / SOC 2018 (the U.S. federal occupational taxonomy). It is organized around a task taxonomy with 55 subfields grouped into 13 industry clusters covering 1K+ tasks. Current results show that the hardest tier remains far from saturated: across mainstream harness and backbone configurations, the average full pass rate is 2.6%. ALE is designed as a living benchmark: its task pool grows continuously as new workflows and industries are onboarded. More broadly, ALE is intended not merely as another leaderboard, but as an instrument for closing the gap between benchmark success and GDP-relevant impact.

LGAug 5, 2023
Meta-learning in healthcare: A survey

Alireza Rafiei, Ronald Moore, Sina Jahromi et al.

As a subset of machine learning, meta-learning, or learning to learn, aims at improving the model's capabilities by employing prior knowledge and experience. A meta-learning paradigm can appropriately tackle the conventional challenges of traditional learning approaches, such as insufficient number of samples, domain shifts, and generalization. These unique characteristics position meta-learning as a suitable choice for developing influential solutions in various healthcare contexts, where the available data is often insufficient, and the data collection methodologies are different. This survey discusses meta-learning broad applications in the healthcare domain to provide insight into how and where it can address critical healthcare challenges. We first describe the theoretical foundations and pivotal methods of meta-learning. We then divide the employed meta-learning approaches in the healthcare domain into two main categories of multi/single-task learning and many/few-shot learning and survey the studies. Finally, we highlight the current challenges in meta-learning research, discuss the potential solutions, and provide future perspectives on meta-learning in healthcare.

LGDec 30, 2025
Early Prediction of Sepsis using Heart Rate Signals and Genetic Optimized LSTM Algorithm

Alireza Rafiei, Farshid Hajati, Alireza Rezaee et al.

Sepsis, characterized by a dysregulated immune response to infection, results in significant mortality, morbidity, and healthcare costs. The timely prediction of sepsis progression is crucial for reducing adverse outcomes through early intervention. Despite the development of numerous models for Intensive Care Unit (ICU) patients, there remains a notable gap in approaches for the early detection of sepsis in non-ward settings. This research introduces and evaluates four novel machine learning algorithms designed for predicting the onset of sepsis on wearable devices by analyzing heart rate data. The architecture of these models was refined through a genetic algorithm, optimizing for performance, computational complexity, and memory requirements. Performance metrics were subsequently extracted for each model to evaluate their feasibility for implementation on wearable devices capable of accurate heart rate monitoring. The models were initially tailored for a prediction window of one hour, later extended to four hours through transfer learning. The encouraging outcomes of this study suggest the potential for wearable technology to facilitate early sepsis detection outside ICU and ward environments.

41.9SPApr 24
Multi-View Hierarchical Representation Learning of Fetal Hemodynamics for Maternal Hypertension Detection at the Edge

Alireza Rafiei, Anahí Venzor Strader, Esteban Castro Aragón et al.

Hypertensive disorders of pregnancy remain a leading cause of maternal and fetal morbidity worldwide, yet diagnosis relies on intermittent cuff-based blood pressure measurements that are prone to bias and fail to capture continuous physiological dynamics. Growing evidence suggests that fetal cardiovascular activity is associated with maternal-placental hemodynamics and may encode markers of maternal hypertension. To analyze this, we collected a large-scale dataset of fetal one-dimensional Doppler ultrasound recordings paired with maternal blood pressure from 3,255 pregnant women across 8,170 antenatal visits in rural Guatemala. We developed AutoHyPE, a hierarchical attention network that models short- and long-term signal structure, incorporating a novel prototype-based contrastive learning and multi-view strategy to enhance representation robustness under long-tailed class distribution and biological variability. AutoHyPE achieved an AUROC of 0.80 for maternal hypertension detection, outperforming baseline approaches while maintaining balanced performance across classes, with no performance degradation in an edge deployment scenario. Our findings demonstrated that fetal cardiac mechanical activity contains hemodynamic features indicative of maternal hypertension status. This supports a promising paradigm shift toward continuous, objective monitoring of maternal health using existing, low-cost ultrasound technology and introduces a complementary approach to traditional methods based on blood pressure measurements, advancing scalable prenatal care.

LGJan 1, 2024
Robust Meta-Model for Predicting the Need for Blood Transfusion in Non-traumatic ICU Patients

Alireza Rafiei, Ronald Moore, Tilendra Choudhary et al.

Objective: Blood transfusions, crucial in managing anemia and coagulopathy in ICU settings, require accurate prediction for effective resource allocation and patient risk assessment. However, existing clinical decision support systems have primarily targeted a particular patient demographic with unique medical conditions and focused on a single type of blood transfusion. This study aims to develop an advanced machine learning-based model to predict the probability of transfusion necessity over the next 24 hours for a diverse range of non-traumatic ICU patients. Methods: We conducted a retrospective cohort study on 72,072 adult non-traumatic ICU patients admitted to a high-volume US metropolitan academic hospital between 2016 and 2020. We developed a meta-learner and various machine learning models to serve as predictors, training them annually with four-year data and evaluating on the fifth, unseen year, iteratively over five years. Results: The experimental results revealed that the meta-model surpasses the other models in different development scenarios. It achieved notable performance metrics, including an Area Under the Receiver Operating Characteristic (AUROC) curve of 0.97, an accuracy rate of 0.93, and an F1-score of 0.89 in the best scenario. Conclusion: This study pioneers the use of machine learning models for predicting blood transfusion needs in a diverse cohort of critically ill patients. The findings of this evaluation confirm that our model not only predicts transfusion requirements effectively but also identifies key biomarkers for making transfusion decisions.

LGMar 12, 2025
Edge AI for Real-time Fetal Assessment in Rural Guatemala

Nasim Katebi, Mohammad Ahmad, Mohsen Motie-Shirazi et al.

Perinatal complications, defined as conditions that arise during pregnancy, childbirth, and the immediate postpartum period, represent a significant burden on maternal and neonatal health worldwide. Factors contributing to these disparities include limited access to quality healthcare, socioeconomic inequalities, and variations in healthcare infrastructure. Addressing these issues is crucial for improving health outcomes for mothers and newborns, particularly in underserved communities. To mitigate these challenges, we have developed an AI-enabled smartphone application designed to provide decision support at the point-of-care. This tool aims to enhance health monitoring during pregnancy by leveraging machine learning (ML) techniques. The intended use of this application is to assist midwives during routine home visits by offering real-time analysis and providing feedback based on collected data. The application integrates TensorFlow Lite (TFLite) and other Python-based algorithms within a Kotlin framework to process data in real-time. It is designed for use in low-resource settings, where traditional healthcare infrastructure may be lacking. The intended patient population includes pregnant women and new mothers in underserved areas and the developed system was piloted in rural Guatemala. This ML-based solution addresses the critical need for accessible and quality perinatal care by empowering healthcare providers with decision support tools to improve maternal and neonatal health outcomes.

CLFeb 26, 2024
Social Media as a Sensor: Analyzing Twitter Data for Breast Cancer Medication Effects Using Natural Language Processing

Seibi Kobara, Alireza Rafiei, Masoud Nateghi et al.

Breast cancer is a significant public health concern and is the leading cause of cancer-related deaths among women. Despite advances in breast cancer treatments, medication non-adherence remains a major problem. As electronic health records do not typically capture patient-reported outcomes that may reveal information about medication-related experiences, social media presents an attractive resource for enhancing our understanding of the patients' treatment experiences. In this paper, we developed natural language processing (NLP) based methodologies to study information posted by an automatically curated breast cancer cohort from social media. We employed a transformer-based classifier to identify breast cancer patients/survivors on X (Twitter) based on their self-reported information, and we collected longitudinal data from their profiles. We then designed a multi-layer rule-based model to develop a breast cancer therapy-associated side effect lexicon and detect patterns of medication usage and associated side effects among breast cancer patients. 1,454,637 posts were available from 583,962 unique users, of which 62,042 were detected as breast cancer members using our transformer-based model. 198 cohort members mentioned breast cancer medications with tamoxifen as the most common. Our side effect lexicon identified well-known side effects of hormone and chemotherapy. Furthermore, it discovered a subject feeling towards cancer and medications, which may suggest a pre-clinical phase of side effects or emotional distress. This analysis highlighted not only the utility of NLP techniques in unstructured social media data to identify self-reported breast cancer posts, medication usage patterns, and treatment side effects but also the richness of social data on such clinical questions.

LGApr 17, 2025
Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms

Alireza Rafiei, Gari D. Clifford, Nasim Katebi

Fetal health monitoring through one-dimensional Doppler ultrasound (DUS) signals offers a cost-effective and accessible approach that is increasingly gaining interest. Despite its potential, the development of machine learning based techniques to assess the health condition of mothers and fetuses using DUS signals remains limited. This scarcity is primarily due to the lack of extensive DUS datasets with a reliable reference for interpretation and data imbalance across different gestational ages. In response, we introduce a novel autoregressive generative model designed to map fetal electrocardiogram (FECG) signals to corresponding DUS waveforms (Auto-FEDUS). By leveraging a neural temporal network based on dilated causal convolutions that operate directly on the waveform level, the model effectively captures both short and long-range dependencies within the signals, preserving the integrity of generated data. Cross-subject experiments demonstrate that Auto-FEDUS outperforms conventional generative architectures across both time and frequency domain evaluations, producing DUS signals that closely resemble the morphology of their real counterparts. The realism of these synthesized signals was further gauged using a quality assessment model, which classified all as good quality, and a heart rate estimation model, which produced comparable results for generated and real data, with a Bland-Altman limit of 4.5 beats per minute. This advancement offers a promising solution for mitigating limited data availability and enhancing the training of DUS-based fetal models, making them more effective and generalizable.