Tai Tan Mai

CY
h-index: 16
Papers: 6
Citations: 10
Novelty: 23%
AI Score: 39

6 Papers

28.4 · AI · Apr 2
Retrieval-aligned Tabular Foundation Models Enable Robust Clinical Risk Prediction in Electronic Health Records Under Real-world Constraints

Minh-Khoi Pham, Thang-Long Nguyen Ho, Thao Thi Phuong Dao et al.

Clinical prediction from structured electronic health records (EHRs) is challenging due to high dimensionality, heterogeneity, class imbalance, and distribution shift. While tabular in-context learning (TICL) and retrieval-augmented methods perform well on generic benchmarks, their behavior in clinical settings remains unclear. We present a multi-cohort EHR benchmark comparing classical, deep tabular, and TICL models across varying data scale, feature dimensionality, outcome rarity, and cross-cohort generalization. PFN-based TICL models are sample-efficient in low-data regimes but degrade under naive distance-based retrieval as heterogeneity and imbalance increase. We propose AWARE, a task-aligned retrieval framework using supervised embedding learning and lightweight adapters. AWARE improves AUPRC by up to 12.2% under extreme imbalance, with gains increasing with data complexity. Our results identify retrieval quality and retrieval-inference alignment as key bottlenecks for deploying tabular in-context learning in clinical prediction.

CY · Sep 19, 2024
ARTAI: An Evaluation Platform to Assess Societal Risk of Recommender Algorithms

Qin Ruan, Jin Xu, Ruihai Dong et al.

Societal risk emanating from how recommender algorithms disseminate content online is now well documented. Emergent regulation aims to mitigate this risk through ethical audits and enabling new research on the social impact of algorithms. However, there is currently a need for tools and methods that enable such evaluation. This paper presents ARTAI, an evaluation environment that enables large-scale assessments of recommender algorithms to identify harmful patterns in how content is distributed online and enables the implementation of new regulatory requirements for increased transparency in recommender systems.

CY · Oct 4, 2023
Key Factors Affecting European Reactions to AI in European Full and Flawed Democracies

Long Pham, Barry O'Sullivan, Tai Tan Mai

This study examines the key factors that affect European reactions to artificial intelligence (AI) in the context of both full and flawed democracies in Europe. Analysing a dataset of 4,006 respondents, categorised into full democracies and flawed democracies based on the Democracy Index developed by the Economist Intelligence Unit (EIU), this research identifies crucial factors that shape European attitudes toward AI in these two types of democracies. The analysis reveals noteworthy findings. Firstly, it is observed that flawed democracies tend to exhibit higher levels of trust in government entities compared to their counterparts in full democracies. Additionally, individuals residing in flawed democracies demonstrate a more positive attitude toward AI when compared to respondents from full democracies. However, the study finds no significant difference in AI awareness between the two types of democracies, indicating a similar level of general knowledge about AI technologies among European citizens. Moreover, the study reveals that trust in AI measures, specifically "Trust AI Solution", does not significantly vary between full and flawed democracies. This suggests that despite the differences in democratic quality, both types of democracies have similar levels of confidence in AI solutions.

CV · Jul 30, 2025 · Code
Exploring the Application of Visual Question Answering (VQA) for Classroom Activity Monitoring

Sinh Trong Vu, Hieu Trung Pham, Dung Manh Nguyen et al.

Classroom behavior monitoring is a critical aspect of educational research, with significant implications for student engagement and learning outcomes. Recent advancements in Visual Question Answering (VQA) models offer promising tools for automatically analyzing complex classroom interactions from video recordings. In this paper, we investigate the applicability of several state-of-the-art open-source VQA models, including LLaMA2, LLaMA3, QWEN3, and NVILA, in the context of classroom behavior analysis. To facilitate rigorous evaluation, we introduce our BAV-Classroom-VQA dataset derived from real-world classroom video recordings at the Banking Academy of Vietnam. We present the methodology for data collection and annotation, and benchmark the performance of the selected VQA models on this dataset. Our initial experimental results demonstrate that all four models achieve promising performance levels in answering behavior-related visual questions, showcasing their potential in future classroom analytics and intervention systems.

CY · Jul 26, 2025
A ChatGPT-based approach for questions generation in higher education

Sinh Trong Vu, Huong Thu Truong, Oanh Tien Do et al.

Large language models have been widely applied in many aspects of real life, bringing significant efficiency to businesses and offering distinctive user experiences. In this paper, we focus on exploring the application of ChatGPT, a chatbot based on a large language model, to support higher-education instructors in generating quiz questions and assessing learners. Specifically, we explore interactive prompting patterns to design an optimal AI-powered question bank creation process. The generated questions are evaluated through a "Blind test" survey sent to various stakeholders including lecturers and learners. Initial results at the Banking Academy of Vietnam are relatively promising, suggesting a potential direction to streamline the time and effort involved in assessing learners at higher education institutes.

AI · Sep 18, 2025
Explainable AI for Infection Prevention and Control: Modeling CPE Acquisition and Patient Outcomes in an Irish Hospital with Transformers

Minh-Khoi Pham, Tai Tan Mai, Martin Crane et al.

Carbapenemase-Producing Enterobacteriaceae (CPE) pose a critical concern for infection prevention and control in hospitals. However, predictive modeling of previously highlighted CPE-associated risks such as readmission, mortality, and extended length of stay (LOS) remains underexplored, particularly with modern deep learning approaches. This study introduces an eXplainable AI (XAI) modeling framework to investigate CPE impact on patient outcomes from Electronic Medical Records (EMR) data of an Irish hospital. We analyzed an inpatient dataset from an Irish acute hospital, incorporating diagnostic codes, ward transitions, patient demographics, infection-related variables and contact network features. Several Transformer-based architectures were benchmarked alongside traditional machine learning models. Clinical outcomes were predicted, and XAI techniques were applied to interpret model decisions. Our framework successfully demonstrated the utility of Transformer-based models, with TabTransformer consistently outperforming baselines across multiple clinical prediction tasks, especially for CPE acquisition (AUROC and sensitivity). We found infection-related features, including historical hospital exposure, admission context, and network centrality measures, to be highly influential in predicting patient outcomes and CPE acquisition risk. Explainability analyses revealed that features like "Area of Residence", "Admission Ward" and prior admissions are key risk factors. Network variables like "Ward PageRank" also ranked highly, reflecting the potential value of structural exposure information. This study presents a robust and explainable AI framework for analyzing complex EMR data to identify key risk factors and predict CPE-related outcomes. Our findings underscore the superior performance of the Transformer models and highlight the importance of diverse clinical and network features.