CL Sep 21, 2023
Improving VTE Identification through Adaptive NLP Model Selection and Clinical Expert Rule-based Classifier from Radiology Reports
Jamie Deng, Yusen Wu, Hilary Hayssen et al.
Rapid and accurate identification of venous thromboembolism (VTE), a severe cardiovascular condition including deep vein thrombosis (DVT) and pulmonary embolism (PE), is important for effective treatment. Automated methods that apply Natural Language Processing (NLP) to radiology reports have shown promising advancements in identifying VTE events from retrospective data cohorts or aiding clinical experts in identifying VTE events from radiology reports. However, effectively training Deep Learning (DL) and NLP models is challenging due to limited labeled medical text data, the complexity and heterogeneity of radiology reports, and data imbalance. This study proposes novel combinations of DL methods, along with data augmentation, adaptive pre-trained NLP model selection, and a clinical expert NLP rule-based classifier, to improve the accuracy of VTE identification in unstructured (free-text) radiology reports. Our experimental results demonstrate the model's efficacy, achieving 97% accuracy and a 97% F1 score in predicting DVT, and 98.3% accuracy and a 98.4% F1 score in predicting PE. These findings emphasize the model's robustness and its potential to significantly contribute to VTE research.
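For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and F1 are computed from a binary confusion matrix. The function name and the counts in the usage note are hypothetical illustrations, not figures from the paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary classifier.

    tp, fp, fn, tn are confusion-matrix counts; the values used in the
    usage note are made up for illustration only.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall fall back to 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1
```

Calling `accuracy_and_f1(90, 10, 10, 90)` on these hypothetical counts gives an accuracy and an F1 score of about 0.9 each. On imbalanced data the two can diverge sharply, which is presumably why the abstract reports both.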
CR Sep 21, 2023
Enabling Quartile-based Estimated-Mean Gradient Aggregation As Baseline for Federated Image Classifications
Yusen Wu, Jamie Deng, Hao Chen et al.
Federated Learning (FL) has revolutionized how we train deep neural networks by enabling decentralized collaboration while safeguarding sensitive data and improving model performance. However, FL faces two crucial challenges: the diverse nature of data held by individual clients and the vulnerability of the FL system to security breaches. This paper introduces an innovative solution named Estimated Mean Aggregation (EMA) that not only addresses these challenges but also provides a fundamental reference point as a $\mathsf{baseline}$ for advanced aggregation techniques in FL systems. EMA's significance lies in its dual role: enhancing model security by effectively handling malicious outliers through trimmed means and uncovering data heterogeneity to ensure that trained models are adaptable across various client datasets. Through a wealth of experiments, EMA consistently demonstrates high accuracy and area under the curve (AUC) compared to alternative methods, establishing itself as a robust baseline for evaluating the effectiveness and security of FL aggregation methods. EMA's contributions thus offer a crucial step forward in advancing the efficiency, security, and versatility of decentralized deep learning in the context of FL.
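The abstract mentions handling malicious outliers through trimmed means. As a rough illustration of that general idea, here is a coordinate-wise trimmed-mean aggregator. This is a generic sketch, not the paper's EMA algorithm: the function name, the `trim_fraction` parameter, and its default are assumptions.

```python
def trimmed_mean_aggregate(client_updates, trim_fraction=0.25):
    """Coordinate-wise trimmed mean over client update vectors.

    client_updates: list of equal-length lists of floats, one per client.
    trim_fraction: share of values dropped from EACH end of every
    coordinate (hypothetical default, not taken from the paper).
    """
    n = len(client_updates)
    k = int(n * trim_fraction)
    if 2 * k >= n:
        raise ValueError("trim_fraction removes every client update")
    dim = len(client_updates[0])
    aggregated = []
    for j in range(dim):
        # Sort this coordinate across clients and drop the k extremes per side.
        column = sorted(update[j] for update in client_updates)
        kept = column[k:n - k]
        aggregated.append(sum(kept) / len(kept))
    return aggregated
```

With four clients and `trim_fraction=0.25`, one value is discarded at each end of every coordinate, so a single client sending `[100.0, -100.0]` has no influence on the aggregate.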
LG Sep 21, 2023
Soft Merging: A Flexible and Robust Soft Model Merging Approach for Enhanced Neural Network Performance
Hao Chen, Yusen Wu, Phuong Nguyen et al.
Stochastic Gradient Descent (SGD), a widely used optimization algorithm in deep learning, is often limited to converging to local optima due to the non-convex nature of the problem. Leveraging these local optima to improve model performance remains a challenging task. Given the inherent complexity of neural networks, simple arithmetic averaging of the obtained local optima models yields undesirable results. This paper proposes a soft merging method that facilitates rapid merging of multiple models, simplifies the merging of specific parts of neural networks, and enhances robustness against malicious models with extreme values. This is achieved by learning gate parameters through a surrogate of the $l_0$ norm using the hard concrete distribution, without modifying the model weights of the given local optima models. This merging process not only enhances model performance by converging to a better local optimum, but also minimizes computational costs, offering an efficient and explicit learning process integrated with stochastic gradient descent. Thorough experiments underscore the effectiveness and superior performance of the merged neural networks.
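To make the gating idea more concrete, the sketch below samples a gate from a hard concrete distribution and uses it to blend two weight vectors. The stretch parameters (`beta`, `gamma`, `zeta`) are values commonly seen in the $l_0$-regularization literature and are assumptions here. The convex blend in `soft_merge` is a hypothetical simplification, not the paper's merging procedure.

```python
import math
import random


def hard_concrete_sample(log_alpha, beta=2 / 3, gamma=-0.1, zeta=1.1, rng=random):
    """Draw one gate value in [0, 1] from a hard concrete distribution.

    log_alpha is the learnable location parameter; beta, gamma and zeta
    are assumed defaults, not values taken from the paper.
    """
    u = rng.uniform(1e-6, 1 - 1e-6)
    # Binary concrete sample via the logistic reparameterization.
    s = 1 / (1 + math.exp(-(math.log(u) - math.log(1 - u) + log_alpha) / beta))
    # Stretch to (gamma, zeta), then clip so exact 0 and 1 have positive mass.
    s_bar = s * (zeta - gamma) + gamma
    return min(1.0, max(0.0, s_bar))


def soft_merge(weights_a, weights_b, gate):
    """Blend two weight vectors with one gate (hypothetical simplification)."""
    return [gate * a + (1 - gate) * b for a, b in zip(weights_a, weights_b)]
```

Because the stretched sample is clipped, a gate can land exactly on 0 or 1, which is what lets such gates act as a differentiable surrogate for an $l_0$ penalty while the underlying model weights stay untouched.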
CL Jan 7, 2024
CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks
Chau Nguyen, Phuong Nguyen, Thanh Tran et al.
The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.
LG Aug 16, 2024
Improving VTE Identification through Language Models from Radiology Reports: A Comparative Study of Mamba, Phi-3 Mini, and BERT
Jamie Deng, Yusen Wu, Yelena Yesha et al.
Venous thromboembolism (VTE) is a critical cardiovascular condition, encompassing deep vein thrombosis (DVT) and pulmonary embolism (PE). Accurate and timely identification of VTE is essential for effective medical care. This study builds upon our previous work, which addressed VTE detection using deep learning methods for DVT and a hybrid approach combining deep learning and rule-based classification for PE. Our earlier approaches, while effective, had two major limitations: they were complex and required expert involvement for feature engineering of the rule set. To overcome these challenges, we utilize the Mamba architecture-based classifier. This model achieves remarkable results, with a 97% accuracy and F1 score on the DVT dataset and a 98% accuracy and F1 score on the PE dataset. In contrast to the previous hybrid method on PE identification, the Mamba classifier eliminates the need for hand-engineered rules, significantly reducing model complexity while maintaining comparable performance. Additionally, we evaluated a lightweight Large Language Model (LLM), Phi-3 Mini, in detecting VTE. While this model delivers competitive results, outperforming the baseline BERT models, it proves to be computationally intensive due to its larger parameter set. Our evaluation shows that the Mamba-based model demonstrates superior performance and efficiency in VTE identification, offering an effective solution to the limitations of previous approaches.
IV Aug 15, 2024
Distributional Drift Detection in Medical Imaging with Sketching and Fine-Tuned Transformer
Yusen Wu, Phuong Nguyen, Rose Yesha et al.
Distributional drift detection is important in medical applications as it helps ensure the accuracy and reliability of models by identifying changes in the underlying data distribution that could affect the prediction results of machine learning models. However, current methods have limitations in detecting drift; for example, the inclusion of abnormal datasets can lead to unfair comparisons. This paper presents an accurate and sensitive approach to detect distributional drift in CT-scan medical images by leveraging data-sketching and fine-tuning techniques. We developed a robust baseline library model for real-time anomaly detection, allowing for efficient comparison of incoming images and identification of anomalies. Additionally, we fine-tuned a pre-trained Vision Transformer model to extract relevant features, using mammography as a case study, significantly enhancing model accuracy to 99.11%. By combining data sketches with fine-tuning, our feature extraction evaluation demonstrated that cosine similarity scores between similar datasets improve substantially, rising from around 50% to 99.1%. Finally, the sensitivity evaluation shows that our solutions are highly sensitive to even 1% salt-and-pepper and speckle noise, and are not sensitive to lighting noise (e.g., lighting conditions have no impact on data drift). The proposed methods offer a scalable and reliable solution for maintaining the accuracy of diagnostic models in dynamic clinical environments.
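The two ingredients named in the abstract, cosine similarity as a drift score and salt-and-pepper noise as a perturbation, can be sketched as follows. This toy version works on flat lists of pixel intensities in [0, 1] rather than on Vision Transformer features or data sketches. Both function names are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def add_salt_and_pepper(pixels, fraction, rng):
    """Return a copy of pixels with a given fraction set to 0.0 or 1.0.

    pixels: flat list of intensities in [0, 1]; rng: a random.Random instance.
    """
    noisy = list(pixels)
    n_corrupt = int(len(noisy) * fraction)
    for idx in rng.sample(range(len(noisy)), n_corrupt):
        noisy[idx] = rng.choice((0.0, 1.0))
    return noisy
```

A drift check under these assumptions would compare `cosine_similarity(reference, incoming)` against a threshold chosen on held-out data. The abstract does not state such a threshold, so none is suggested here.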
CL Apr 14, 2024
Cross-Data Knowledge Graph Construction for LLM-enabled Educational Question-Answering System: A Case Study at HCMUT
Tuan Bui, Oanh Tran, Phuong Nguyen et al.
In today's rapidly evolving landscape of Artificial Intelligence, large language models (LLMs) have emerged as a vibrant research topic, with applications across many fields. Despite their powerful language capabilities, LLMs, like pre-trained language models (PLMs), still face challenges in remembering events, incorporating new information, and addressing domain-specific issues or hallucinations. To overcome these limitations, some researchers have proposed Retrieval-Augmented Generation (RAG) techniques, while others have proposed integrating LLMs with Knowledge Graphs (KGs) to provide factual context, thereby improving performance and delivering more accurate responses to user queries. Education plays a crucial role in human development and progress. With the ongoing technological transformation, traditional education is being replaced by digital or blended education, so educational data in the digital environment is growing day by day. Data in higher education institutions are diverse, comprising various sources such as unstructured/structured text, relational databases, web/app-based API access, etc. Constructing a Knowledge Graph from these cross-data sources is not a simple task. This article proposes a method for automatically constructing a Knowledge Graph from multiple data sources and discusses some initial applications (experimental trials) of KG in conjunction with LLMs for question-answering tasks.
CL Jun 23, 2025
Enhancing Entity Aware Machine Translation with Multi-task Learning
An Trieu, Phuong Nguyen, Minh Le Nguyen
Entity-aware machine translation (EAMT) is a complicated task in natural language processing, due not only to the shortage of translation data for the entities to be translated but also to the complexity of the context that must be processed while translating those entities. In this paper, we propose a method that applies multi-task learning to optimize the performance of two subtasks, named entity recognition and machine translation, which improves the final performance of the entity-aware machine translation task. Experiments and analysis are conducted on the dataset provided by the organizer of Task 2 of the SemEval 2025 competition.
LG Nov 29, 2021
SPATL: Salient Parameter Aggregation and Transfer Learning for Heterogeneous Clients in Federated Learning
Sixing Yu, Phuong Nguyen, Waqwoya Abebe et al.
Federated learning (FL) facilitates the training and deployment of AI models on edge devices. Preserving user data privacy in FL introduces several challenges, including expensive communication costs, limited resources, and data heterogeneity. In this paper, we propose SPATL, an FL method that addresses these issues by: (a) introducing a salient parameter selection agent and communicating selected parameters only; (b) splitting a model into a shared encoder and a local predictor, and transferring its knowledge to heterogeneous clients via the locally customized predictor. Additionally, we leverage a gradient control mechanism to further speed up model convergence and increase the robustness of training processes. Experiments demonstrate that SPATL reduces communication overhead, accelerates model inference, and enables stable training processes with better results compared to state-of-the-art methods. Our approach reduces communication cost by up to 86.45%, accelerates local inference by reducing FLOPs by up to 39.7% on VGG-11, and requires 7.4 times less communication overhead when training ResNet-20.
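The communication saving in point (a) comes from sending only a subset of parameters. The sketch below illustrates that idea with a plain magnitude-based top-k rule, which is a stand-in and not SPATL's selection agent. Function names, `keep_ratio`, and `lr` are hypothetical.

```python
def select_salient(update, keep_ratio=0.25):
    """Keep only the largest-magnitude entries of a client update.

    Returns a sparse {index: value} dict. Magnitude ranking is an assumed
    stand-in for a learned salience score.
    """
    k = max(1, int(len(update) * keep_ratio))
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    return {i: update[i] for i in sorted(ranked[:k])}


def apply_sparse_update(global_params, sparse_update, lr=1.0):
    """Add a sparse update to a copy of the global parameter vector."""
    new_params = list(global_params)
    for i, value in sparse_update.items():
        new_params[i] += lr * value
    return new_params
```

With `keep_ratio=0.5`, a client holding the update `[0.01, -0.9, 0.05, 0.4]` would transmit only the entries at indices 1 and 3, halving the payload in this toy case.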
IV Oct 1, 2021
CCS-GAN: COVID-19 CT-scan classification with very few positive training images
Sumeet Menon, Jayalakshmi Mangalagiri, Josh Galita et al.
We present a novel algorithm that is able to classify COVID-19 pneumonia from CT Scan slices using a very small sample of training images exhibiting COVID-19 pneumonia in tandem with a larger number of normal images. This algorithm is able to achieve high classification accuracy using as few as 10 positive training slices (from 10 positive cases), which to the best of our knowledge is one order of magnitude fewer than the next closest published work at the time of writing. Deep learning with extremely small positive training volumes is a very difficult problem and has been an important topic during the COVID-19 pandemic, because for quite some time it was difficult to obtain large volumes of COVID-19 positive images for training. Algorithms that can learn to screen for diseases using few examples are an important area of research. We present the Cycle Consistent Segmentation Generative Adversarial Network (CCS-GAN). CCS-GAN combines style transfer with pulmonary segmentation and relevant transfer learning from negative images in order to create a larger volume of synthetic positive images for the purposes of improving diagnostic classification performance. A VGG-19 classifier plus CCS-GAN was trained using a small sample of positive image slices, ranging from at most 50 down to as few as 10 COVID-19 positive CT-scan images. CCS-GAN achieves high accuracy with few positive images and thereby greatly reduces the barrier of acquiring large training volumes in order to train a diagnostic classifier for COVID-19.
CR Sep 5, 2021
Tolerating Adversarial Attacks and Byzantine Faults in Distributed Machine Learning
Yusen Wu, Hao Chen, Xin Wang et al.
Adversarial attacks attempt to disrupt the training, retraining, and use of artificial intelligence and machine learning models in large-scale distributed machine learning systems, which poses security risks to their prediction outcomes. For example, attackers attempt to poison the model by either presenting inaccurate or misrepresentative data or altering the model's parameters. In addition, Byzantine faults, including software, hardware, and network issues, occur in distributed systems and also negatively affect prediction outcomes. In this paper, we propose a novel distributed training algorithm, partial synchronous stochastic gradient descent (ParSGD), which defends against adversarial attacks and/or tolerates Byzantine faults. We demonstrate the effectiveness of our algorithm under three common adversarial attacks against the ML models and a Byzantine fault during the training phase. Our results show that, using ParSGD, ML models can still produce accurate predictions as if they were not being attacked or experiencing failures at all, even when almost half of the nodes are compromised or have failed. We will report the experimental evaluations of ParSGD in comparison with other algorithms.
LG Jun 13, 2021
Heterogeneous Federated Learning using Dynamic Model Pruning and Adaptive Gradient
Sixing Yu, Phuong Nguyen, Ali Anwar et al.
Federated Learning (FL) has emerged as a new paradigm for training machine learning models distributively without sacrificing data security and privacy. Learning models on edge devices such as mobile phones is one of the most common use cases for FL. However, non-independent and identically distributed (non-IID) data on edge devices easily leads to training failures. In particular, over-parameterized machine learning models can easily overfit such data, resulting in inefficient federated learning and poor model performance. To overcome the overfitting issue, we propose an adaptive dynamic pruning approach for FL, which dynamically slims the model by dropping unimportant parameters, thereby preventing overfitting. Since a machine learning model's parameters react differently to different training samples, adaptive dynamic pruning evaluates the salience of the model's parameters according to the input training sample and retains only the salient parameters' gradients during back-propagation. We performed comprehensive experiments to evaluate our approach. The results show that, by removing redundant parameters in neural networks, our approach significantly reduces overfitting and greatly improves training efficiency. In particular, when training ResNet-32 on CIFAR-10, our approach reduces the communication cost by 57%. We further demonstrate the inference acceleration capability of the proposed algorithm. Our approach reduces the inference FLOPs of DNNs on edge devices by up to 50% while maintaining the model's quality.
IV Apr 2, 2021
Toward Generating Synthetic CT Volumes using a 3D-Conditional Generative Adversarial Network
Jayalakshmi Mangalagiri, David Chapman, Aryya Gangopadhyay et al.
We present a novel conditional Generative Adversarial Network (cGAN) architecture that is capable of generating 3D Computed Tomography scans in voxels from noisy and/or pixelated approximations and with the potential to generate full synthetic 3D scan volumes. We believe cGAN to be a tractable approach to generate 3D CT volumes, even though the problem of generating full resolution deep fakes is presently impractical due to GPU memory limitations. We present results for autoencoder, denoising, and depixelating tasks which are trained and tested on two novel COVID19 CT datasets. For our evaluation metrics, the Peak Signal-to-Noise Ratio (PSNR) ranges from 12.53 to 46.46 dB, and the Structural Similarity Index (SSIM) ranges from 0.89 to 1.
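As a reminder of what the PSNR figures measure, the standard definition is PSNR = 10 log10(MAX^2 / MSE), where MAX is the largest possible intensity and MSE is the mean squared error between reference and reconstruction. The sketch below computes it over flat lists of voxel intensities. The function name and the normalization `max_value=1.0` are assumptions.

```python
import math


def psnr(reference, reconstructed, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences.

    max_value is the maximum possible intensity (assumed 1.0 for
    normalized voxels). Identical inputs give infinity.
    """
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

For normalized intensities, a uniform error of 0.1 per voxel corresponds to roughly 20 dB, which helps place the reported 12.53 to 46.46 dB range in context.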
IV Oct 19, 2020
Lung Nodule Classification Using Biomarkers, Volumetric Radiomics and 3D CNNs
Kushal Mehta, Arshita Jain, Jayalakshmi Mangalagiri et al.
We present a hybrid algorithm to estimate lung nodule malignancy that combines imaging biomarkers from radiologists' annotations with image classification of CT scans. Our algorithm employs a 3D Convolutional Neural Network (CNN) as well as a Random Forest in order to combine CT imagery with biomarker annotation and volumetric radiomic features. We analyze and compare the performance of the algorithm using only imagery, only biomarkers, combined imagery + biomarkers, combined imagery + volumetric radiomic features, and finally the combination of imagery + biomarkers + volumetric features in order to classify the suspicion level of nodule malignancy. The National Cancer Institute (NCI) Lung Image Database Consortium (LIDC) IDRI dataset is used to train and evaluate the classification task. We show that the incorporation of semi-supervised learning by means of K-Nearest-Neighbors (KNN) can increase the available training sample size of the LIDC-IDRI, thereby further improving the accuracy of malignancy estimation of most of the models tested, although there is no significant improvement with the use of KNN semi-supervised learning if image classification with CNNs and volumetric features are combined with descriptive biomarkers. Unexpectedly, we also show that a model using image biomarkers alone is more accurate than one that combines biomarkers with volumetric radiomics, 3D CNNs, and semi-supervised learning. We discuss the possibility that this result may be influenced by cognitive bias in LIDC-IDRI because malignancy estimates were recorded by the same radiologist panel as biomarkers, as well as future work to incorporate pathology information over a subset of study participants.
LG Oct 2, 2020
Deep Expectation-Maximization for Semi-Supervised Lung Cancer Screening
Sumeet Menon, David Chapman, Phuong Nguyen et al.
We present a semi-supervised algorithm for lung cancer screening in which a 3D Convolutional Neural Network (CNN) is trained using the Expectation-Maximization (EM) meta-algorithm. Semi-supervised learning allows a smaller labelled data-set to be combined with an unlabeled data-set in order to provide a larger and more diverse training sample. EM allows the algorithm to simultaneously calculate a maximum likelihood estimate of the CNN training coefficients along with the labels for the unlabeled training set which are defined as a latent variable space. We evaluate the model performance of the Semi-Supervised EM algorithm for CNNs through cross-domain training of the Kaggle Data Science Bowl 2017 (Kaggle17) data-set with the National Lung Screening Trial (NLST) data-set. Our results show that the Semi-Supervised EM algorithm greatly improves the classification accuracy of the cross-domain lung cancer screening, although results are lower than a fully supervised approach with the advantage of additional labelled data from the unsupervised sample. As such, we demonstrate that Semi-Supervised EM is a valuable technique to improve the accuracy of lung cancer screening models using 3D CNNs.
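The EM loop described above alternates between inferring labels for unlabeled samples and re-estimating model parameters. The sketch below shows that alternation on a deliberately tiny problem: a two-class, one-dimensional Gaussian model with fixed variance stands in for the paper's 3D CNN. All names and defaults are hypothetical.

```python
import math


def semi_supervised_em(labeled, unlabeled, n_iter=20, sigma=1.0):
    """Toy semi-supervised EM with two 1-D Gaussian classes.

    labeled: list of (x, y) pairs with y in {0, 1}, both classes present.
    unlabeled: list of x values whose labels are treated as latent.
    Returns (class means, P(y=1 | x) for each unlabeled x).
    """
    # Initialize the class means from the labeled data only.
    means = []
    for c in (0, 1):
        xs = [x for x, y in labeled if y == c]
        means.append(sum(xs) / len(xs))
    resp = []
    for _ in range(n_iter):
        # E-step: soft labels for the unlabeled points under current means.
        resp = []
        for x in unlabeled:
            l0 = math.exp(-((x - means[0]) ** 2) / (2 * sigma ** 2))
            l1 = math.exp(-((x - means[1]) ** 2) / (2 * sigma ** 2))
            total = l0 + l1
            resp.append(l1 / total if total > 0 else 0.5)
        # M-step: re-estimate means from hard labeled and soft unlabeled data.
        for c in (0, 1):
            num = sum(x for x, y in labeled if y == c)
            den = float(sum(1 for _, y in labeled if y == c))
            for x, r in zip(unlabeled, resp):
                w = r if c == 1 else 1 - r
                num += w * x
                den += w
            means[c] = num / den
    return means, resp
```

In the paper's setting the M-step would be a round of CNN training and the E-step a forward pass over the unlabeled scans. The structure of the loop is the only thing this toy version is meant to convey.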
LG Sep 26, 2020
Generating Realistic COVID19 X-rays with a Mean Teacher + Transfer Learning GAN
Sumeet Menon, Joshua Galita, David Chapman et al.
COVID-19 is a novel infectious disease responsible for over 800K deaths worldwide as of August 2020. The need for rapid testing is a high priority, and alternative testing strategies including X-ray image classification are a promising area of research. However, at present, public datasets for COVID19 X-ray images have low data volumes, making it challenging to develop accurate image classifiers. Several recent papers have made use of Generative Adversarial Networks (GANs) in order to increase the training data volumes, but realistic synthetic COVID19 X-rays remain challenging to generate. We present a novel Mean Teacher + Transfer GAN (MTT-GAN) that generates COVID19 chest X-ray images of high quality. In order to create a more accurate GAN, we employ transfer learning from the Kaggle Pneumonia X-Ray dataset, a highly relevant data source orders of magnitude larger than public COVID19 datasets. Furthermore, we employ the Mean Teacher algorithm as a constraint to improve stability of training. Our qualitative analysis shows that the MTT-GAN generates X-ray images that are greatly superior to a baseline GAN and visually comparable to real X-rays, although board-certified radiologists can distinguish MTT-GAN fakes from real COVID19 X-rays. Quantitative analysis shows that MTT-GAN greatly improves the accuracy of both a binary COVID19 classifier and a multi-class Pneumonia classifier as compared to a baseline GAN. Our classification accuracy is favourable as compared to recently reported results in the literature for similar binary and multi-class COVID19 screening tasks.
AI Jun 12, 2017
DAC-h3: A Proactive Robot Cognitive Architecture to Acquire and Express Knowledge About the World and the Self
Clément Moulin-Frier, Tobias Fischer, Maxime Petit et al.
This paper introduces a cognitive architecture for a humanoid robot to engage in a proactive, mixed-initiative exploration and manipulation of its environment, where the initiative can originate from both the human and the robot. The framework, based on a biologically-grounded theory of the brain and mind, integrates a reactive interaction engine, a number of state-of-the-art perceptual and motor learning algorithms, as well as planning abilities and an autobiographical memory. The architecture as a whole drives the robot behavior to solve the symbol grounding problem, acquire language capabilities, execute goal-oriented behavior, and express a verbal narrative of its own experience in the world. We validate our approach in human-robot interaction experiments with the iCub humanoid robot, showing that the proposed cognitive architecture can be applied in real time within a realistic scenario and that it can be used with naive users.
LG Feb 11, 2013
Optimal Regret Bounds for Selecting the State Representation in Reinforcement Learning
Odalric-Ambrym Maillard, Phuong Nguyen, Ronald Ortner et al.
We consider an agent interacting with an environment in a single stream of actions, observations, and rewards, with no reset. This process is not assumed to be a Markov Decision Process (MDP). Rather, the agent has several representations (mapping histories of past interactions to a discrete state space) of the environment with unknown dynamics, only some of which result in an MDP. The goal is to minimize the average regret criterion against an agent who knows an MDP representation giving the highest optimal reward, and acts optimally in it. Recent regret bounds for this setting are of order $O(T^{2/3})$ with an additive term constant yet exponential in some characteristics of the optimal MDP. We propose an algorithm whose regret after $T$ time steps is $O(\sqrt{T})$, with all constants reasonably small. This is optimal in $T$ since $O(\sqrt{T})$ is the optimal regret in the setting of learning in a (single discrete) MDP.