Thanh-Dung Le

LG
h-index: 76
13 papers
93 citations
Novelty: 30%
AI Score: 24

13 Papers

CV · Sep 5, 2024
Onboard Satellite Image Classification for Earth Observation: A Comparative Study of ViT Models

Thanh-Dung Le, Vu Nguyen Ha, Ti Ti Nguyen et al.

This study identifies the most effective pre-trained model for land use classification in onboard satellite processing, emphasizing high accuracy, computational efficiency, and robustness against the noisy data conditions commonly encountered during satellite-based inference. Through extensive experimentation, we compare the performance of traditional CNN-based, ResNet-based, and various pre-trained Vision Transformer models. Our findings demonstrate that pre-trained Vision Transformer (ViT) models, particularly MobileViTV2 and EfficientViT-M2, outperform models trained from scratch in terms of accuracy and efficiency. These models achieve high performance with reduced computational requirements and exhibit greater resilience during inference under noisy conditions. While MobileViTV2 excels on clean validation data, EfficientViT-M2 proves more robust when handling noise, making it the most suitable model for onboard satellite Earth observation (EO) tasks. Our experimental results demonstrate that EfficientViT-M2 is the optimal choice for reliable and efficient remote sensing image classification (RS-IC) in satellite operations, achieving 98.76% accuracy, precision, and recall. Specifically, EfficientViT-M2 delivers the highest performance across all metrics, excels in training efficiency (1,000 s) and inference time (10 s), and demonstrates greater robustness (overall robustness score of 0.79). In addition, EfficientViT-M2 consumes 63.93% less power than MobileViTV2 (79.23 W) and 73.26% less power than SwinTransformer (108.90 W), highlighting its significant advantage in energy efficiency.

LG · Sep 23, 2024
On-Air Deep Learning Integrated Semantic Inference Models for Enhanced Earth Observation Satellite Networks

Hong-fu Chou, Vu Nguyen Ha, Prabhu Thiruvasagam et al.

Earth Observation (EO) systems are crucial for cartography, disaster surveillance, and resource administration. Nonetheless, they encounter considerable obstacles in the processing and transmission of extensive data, especially in specialized domains such as precision agriculture and real-time disaster response. Earth observation satellites, outfitted with remote sensing technology, gather data from onboard sensors and IoT-enabled terrestrial objects, delivering important information remotely. Domain-adapted Large Language Models (LLMs) provide a solution by enabling the integration of raw and processed EO data. Through domain adaptation, LLMs improve the assimilation and analysis of many data sources, tackling the intricacies of specialized datasets in agriculture and disaster response. This data synthesis, directed by LLMs, enhances the precision and pertinence of conveyed information. This study provides a thorough examination of using semantic inference and deep learning for sophisticated EO systems. It presents an innovative architecture for semantic communication in EO satellite networks, designed to improve data transmission efficiency using semantic processing methodologies. Recent advancements in onboard processing technologies enable dependable, adaptable, and energy-efficient data management in orbit. These improvements guarantee reliable performance in adverse space circumstances using radiation-hardened and reconfigurable technology. Collectively, these advancements enable next-generation satellite missions with improved processing capabilities, crucial for operational flexibility and real-time decision-making in 6G satellite communication.

LG · Sep 26, 2022
Adaptation of Autoencoder for Sparsity Reduction From Clinical Notes Representation Learning

Thanh-Dung Le, Rita Noumeir, Jerome Rambaud et al.

When dealing with clinical text classification on a small dataset, recent studies have confirmed that a well-tuned multilayer perceptron outperforms other generative classifiers, including deep learning ones. To increase the performance of the neural network classifier, feature selection for the learning representation can be used effectively. However, most feature selection methods only estimate the degree of linear dependency between variables and select the best features based on univariate statistical tests. Furthermore, the sparsity of the feature space involved in the learning representation is ignored. Goal: Our aim is therefore to assess an alternative approach that tackles sparsity by compressing the clinical representation feature space, so that limited French clinical notes can also be dealt with effectively. Methods: This study proposed an autoencoder learning algorithm to take advantage of sparsity reduction in clinical note representation. The motivation was to determine how to compress sparse, high-dimensional data by reducing the dimension of the clinical note representation feature space. The classification performance of the classifiers was then evaluated in the trained and compressed feature space. Results: The proposed approach provided overall performance gains of up to 3% for each evaluation. Finally, the classifier achieved 92% accuracy, 91% recall, 91% precision, and a 91% F1-score in detecting the patient's condition. Furthermore, the compression mechanism and the autoencoder prediction process were demonstrated by applying the theoretical information bottleneck framework.

CL · Jul 27, 2024
The Impact of LoRA Adapters on LLMs for Clinical Text Classification Under Computational and Data Constraints

Thanh-Dung Le, Ti Ti Nguyen, Vu Nguyen Ha et al.

Fine-tuning Large Language Models (LLMs) for clinical Natural Language Processing (NLP) poses significant challenges due to domain gap, limited data, and stringent hardware constraints. In this study, we evaluate four adapter techniques equivalent to Low-Rank Adaptation (LoRA), namely Adapter, Lightweight, TinyAttention, and Gated Residual Network (GRN), for clinical note classification under real-world, resource-constrained conditions. All experiments were conducted on a single NVIDIA Quadro P620 GPU (2 GB VRAM, 512 CUDA cores, 1.386 TFLOPS FP32), limiting batch sizes to fewer than 8 sequences and the maximum sequence length to 256 tokens. Our clinical corpus comprises only 580,000 tokens, several orders of magnitude smaller than standard LLM pre-training datasets. We fine-tuned three biomedical pre-trained LLMs (CamemBERT-bio, AliBERT, DrBERT) and two lightweight Transformer models trained from scratch. Results show that (1) adapter structures provide no consistent gains when fine-tuning biomedical LLMs under these constraints, and (2) simpler Transformers, with minimal parameter counts and training times under six hours, outperform adapter-augmented LLMs, which required over 1000 GPU-hours. Among adapters, GRN achieved the best metrics (accuracy, precision, recall, F1 = 0.88). These findings demonstrate that, in low-resource clinical settings with limited data and compute, lightweight Transformers trained from scratch offer a more practical and efficient solution than large LLMs, while GRN remains a viable adapter choice when minimal adaptation is needed.
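
For readers unfamiliar with the low-rank idea the abstract refers to, the following is a minimal sketch of a generic LoRA update, not the adapters evaluated in the paper. It assumes the standard formulation in which a frozen weight W receives the update (alpha / r) · B · A; the matrix sizes, values, and function names are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A for W (d_out x d_in), B (d_out x r), A (r x d_in)."""
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d_out, d_in, r):
    """Number of parameters in the low-rank factors B and A."""
    return r * (d_out + d_in)


# Toy values chosen for illustration only (hypothetical, not from the paper).
w = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0]]   # frozen base weight, 2 x 3
a = [[0.5, -0.5, 1.0]]                    # rank-1 factor A, 1 x 3
b_zero = [[0.0], [0.0]]                   # B starts at zero, so the update is a no-op
b_trained = [[1.0], [2.0]]                # pretend values after some training

print(lora_effective_weight(w, a, b_zero, alpha=1.0))
print(lora_effective_weight(w, a, b_trained, alpha=1.0))
print(trainable_params(768, 768, 8), "trainable vs", 768 * 768, "frozen")
```

The last line shows why such adapters are attractive on small GPUs: only the two thin factors are trained while the full weight matrix stays frozen.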

CL · Mar 22, 2023
Improving Transformer Performance for French Clinical Notes Classification Using Mixture of Experts on a Limited Dataset

Thanh-Dung Le, Philippe Jouvet, Rita Noumeir

Transformer-based models have shown outstanding results in natural language processing but face challenges in applications like classifying small-scale clinical texts, especially with constrained computational resources. This study presents a customized Mixture of Experts (MoE) Transformer model for classifying small-scale French clinical texts at CHU Sainte-Justine Hospital. The MoE-Transformer addresses the dual challenges of effective training with limited data and low-resource computation suitable for in-house hospital use. Despite the success of biomedical pre-trained models such as CamemBERT-bio, DrBERT, and AliBERT, their high computational demands make them impractical for many clinical settings. Our MoE-Transformer model not only outperforms DistilBERT, CamemBERT, FlauBERT, and Transformer models on the same dataset but also achieves impressive results: an accuracy of 87%, precision of 87%, recall of 85%, and F1-score of 86%. While the MoE-Transformer does not surpass the performance of biomedical pre-trained BERT models, it can be trained at least 190 times faster, offering a viable alternative for settings with limited data and computational resources. Although the MoE-Transformer, in addressing the challenges of generalization gaps and sharp minima, demonstrates some limitations for efficient and accurate clinical text classification, the model still represents a significant advancement in the field. It is particularly valuable for classifying small French clinical narratives within the privacy and constraints of hospital-based computational resources.
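
As a rough illustration of the Mixture of Experts idea, the sketch below combines expert outputs with softmax gate weights. It is a generic dense MoE layer under assumed names (`softmax`, `moe_output`) and toy experts; the abstract does not describe the paper's gating scheme, so none of this should be read as its architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_output(x, experts, gate_scores):
    """Weighted sum of expert outputs, with weights given by softmax(gate_scores)."""
    weights = softmax(gate_scores)
    outputs = [expert(x) for expert in experts]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]


# Two toy experts (hypothetical): identity and doubling.
experts = [lambda v: list(v), lambda v: [2.0 * t for t in v]]

# Equal gate scores give each expert a weight of 0.5.
print(moe_output([2.0, 4.0], experts, gate_scores=[0.0, 0.0]))
```

In a trained model the gate scores would come from a learned function of the input, so different inputs can rely on different experts.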

LG · Aug 16, 2023
Label Propagation Techniques for Artifact Detection in Imbalanced Classes using Photoplethysmogram Signals

Clara Macabiau, Thanh-Dung Le, Kevin Albert et al.

This study aimed to investigate the application of label propagation techniques to propagate labels among photoplethysmogram (PPG) signals, particularly in scenarios with imbalanced classes and limited data availability, where clean PPG samples are significantly outnumbered by artifact-contaminated samples. We investigated a dataset comprising PPG recordings from 1571 patients, wherein approximately 82% of the samples were identified as clean, while the remaining 18% were contaminated by artifacts. Our research compares the performance of supervised classifiers, such as conventional classifiers and neural networks (Multi-Layer Perceptron (MLP), Transformers, Fully Convolutional Network (FCN)), with the semi-supervised Label Propagation (LP) algorithm for artifact classification in PPG signals. The results indicate that the LP algorithm achieves a precision of 91%, a recall of 90%, and an F1 score of 90% for the "artifacts" class, showcasing its effectiveness in annotating a medical dataset, even in cases where clean samples are rare. Although the K-Nearest Neighbors (KNN) supervised model demonstrated good results with a precision of 89%, a recall of 95%, and an F1 score of 92%, the semi-supervised algorithm excels in artifact detection. In the case of imbalanced and limited pediatric intensive care environment data, the semi-supervised LP algorithm is promising for artifact detection in PPG signals. The results of this study are important for improving the accuracy of PPG-based health monitoring, particularly in situations in which motion artifacts pose challenges to data interpretation.
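
To make the label propagation step concrete, here is a minimal sketch of the classic graph-based algorithm on one-dimensional toy features. It assumes an RBF affinity and clamping of the known labels on every iteration; the feature values, `gamma`, and the function name are illustrative assumptions and are not taken from the study.

```python
import math


def label_propagation(points, labels, n_classes, gamma=1.0, n_iter=100):
    """Propagate labels over an RBF graph. labels[i] is a class index, or -1 if unlabeled."""
    n = len(points)
    # RBF affinities between 1-D points, row-normalised into transition probabilities.
    trans = []
    for i in range(n):
        row = [math.exp(-gamma * (points[i] - points[j]) ** 2) for j in range(n)]
        total = sum(row)
        trans.append([v / total for v in row])

    def initial(i):
        # One-hot distribution for labeled points, uniform for unlabeled ones.
        if labels[i] >= 0:
            return [1.0 if c == labels[i] else 0.0 for c in range(n_classes)]
        return [1.0 / n_classes] * n_classes

    dist = [initial(i) for i in range(n)]
    for _ in range(n_iter):
        spread = [[sum(trans[i][j] * dist[j][c] for j in range(n))
                   for c in range(n_classes)] for i in range(n)]
        # Clamp labeled points back to their known labels after each step.
        dist = [initial(i) if labels[i] >= 0 else spread[i] for i in range(n)]
    return [max(range(n_classes), key=lambda c: dist[i][c]) for i in range(n)]


# Hypothetical 1-D feature per PPG window: 0 = clean, 1 = artifact, -1 = unlabeled.
features = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
known = [0, -1, -1, 1, -1, -1]
print(label_propagation(features, known, n_classes=2))
```

Only two windows carry labels here; the remaining four inherit the label of the cluster they sit in, which is the behaviour that makes the method useful when annotation is scarce.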

CV · Jul 18, 2024
Hybrid Deep Learning-Based for Enhanced Occlusion Segmentation in PICU Patient Monitoring

Mario Francisco Munoz, Hoang Vu Huy, Thanh-Dung Le

Remote patient monitoring has emerged as a prominent non-invasive method, using digital technologies and computer vision (CV) to replace traditional invasive monitoring. While neonatal and pediatric departments embrace this approach, Pediatric Intensive Care Units (PICUs) face the challenge of occlusions hindering accurate image analysis and interpretation. Objective: In this study, we propose a hybrid approach to effectively segment common occlusions encountered in remote monitoring applications within PICUs. Our approach centers on creating a deep-learning pipeline for limited training data scenarios. Methods: First, a combination of the well-established Google DeepLabV3+ segmentation model with the transformer-based Segment Anything Model (SAM) is devised for occlusion segmentation mask proposal and refinement. We then train and validate this pipeline using a small dataset acquired from real-world PICU settings with a Microsoft Kinect camera, achieving an Intersection-over-Union (IoU) metric of 85%. Results: Both quantitative and qualitative analyses underscore the effectiveness of our proposed method. The proposed framework yields an overall classification performance with 92.5% accuracy, 93.8% recall, 90.3% precision, and 92.0% F1-score. Consequently, the proposed method consistently improves the predictions across all metrics, with an average of 2.75% gain in performance compared to the baseline CNN-based framework. Conclusions: Our proposed hybrid approach significantly enhances the segmentation of occlusions in remote patient monitoring within PICU settings. This advancement contributes to improving the quality of care for pediatric patients, addressing a critical need in clinical practice by ensuring more accurate and reliable remote monitoring.
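
The two kinds of metric quoted above can be sketched in a few lines. The snippet assumes binary masks stored as nested lists of 0/1 values (toy data, not from the paper) and uses the usual definitions of IoU and F1; it also shows that the reported F1-score follows from the reported precision and recall.

```python
def iou(pred, target):
    """Intersection-over-Union of two binary masks given as nested lists of 0/1."""
    pairs = [(p, t) for prow, trow in zip(pred, target) for p, t in zip(prow, trow)]
    intersection = sum(p & t for p, t in pairs)
    union = sum(p | t for p, t in pairs)
    return intersection / union if union else 1.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Toy 2 x 2 masks: one overlapping pixel out of three marked pixels.
pred_mask = [[1, 1], [0, 0]]
true_mask = [[1, 0], [1, 0]]
print(iou(pred_mask, true_mask))

# Precision and recall as reported in the abstract.
print(round(f1_score(0.903, 0.938), 3))  # 0.92
```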

LG · Jan 2, 2024
A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection

Thanh-Dung Le, Clara Macabiau, Kévin Albert et al.

Recent research at CHU Sainte Justine's Pediatric Critical Care Unit (PICU) has revealed that traditional machine learning methods, such as semi-supervised label propagation and K-nearest neighbors, outperform Transformer-based models in artifact detection from PPG signals, particularly when data is limited. This study addresses the underutilization of abundant unlabeled data by employing self-supervised learning (SSL) to extract latent features from these data, followed by fine-tuning on labeled data. Our experiments demonstrate that SSL significantly enhances the Transformer model's ability to learn representations, improving its robustness in artifact classification tasks. Among various SSL techniques, including masking, contrastive learning, and DINO (self-distillation with no labels), contrastive learning exhibited the most stable and superior performance in small PPG datasets. Further, we delve into optimizing contrastive loss functions, which are crucial for contrastive SSL. Inspired by InfoNCE, we introduce a novel contrastive loss function that facilitates smoother training and better convergence, thereby enhancing performance in artifact classification. In summary, this study establishes the efficacy of SSL in leveraging unlabeled data, particularly in enhancing the capabilities of the Transformer model. This approach holds promise for broader applications in PICU environments, where annotated data is often limited.
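
For context, the standard InfoNCE loss that inspired the paper's loss can be sketched as below. This is the baseline formulation only; the novel loss introduced in the paper is not specified in the abstract and is not reproduced here. Cosine similarity, the temperature value, and the toy embeddings are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.1):
    """-log( exp(s_pos / T) / sum_i exp(s_i / T) ), computed with a stable log-sum-exp."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, neg) / temperature for neg in negatives]
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy embeddings (hypothetical): the loss is small when the positive matches the query.
query = [1.0, 0.0]
aligned = info_nce(query, positive=[1.0, 0.0], negatives=[[0.0, 1.0]])
misaligned = info_nce(query, positive=[0.0, 1.0], negatives=[[1.0, 0.0]])
print(aligned, misaligned)
```

Minimising this loss pulls two views of the same signal together in embedding space while pushing other signals away, which is how unlabeled PPG windows can still provide a training signal.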

LG · Mar 12, 2025
A Semantic-Loss Function Modeling Framework With Task-Oriented Machine Learning Perspectives

Ti Ti Nguyen, Thanh-Dung Le, Vu Nguyen Ha et al.

The integration of machine learning (ML) has significantly enhanced the capabilities of Earth Observation (EO) systems by enabling the extraction of actionable insights from complex datasets. However, the performance of data-driven EO applications is heavily influenced by the data collection and transmission processes, where limited satellite bandwidth and latency constraints can hinder the full transmission of original data to the receivers. To address this issue, adopting the concepts of Semantic Communication (SC) offers a promising solution by prioritizing the transmission of essential data semantics over raw information. Implementing SC for EO systems requires a thorough understanding of the impact of data processing and communication channel conditions on semantic loss at the processing center. This work proposes a novel data-fitting framework to empirically model the semantic loss using real-world EO datasets and domain-specific insights. The framework quantifies two primary types of semantic loss: (1) source coding loss, assessed via a data quality indicator measuring the impact of processing on raw source data, and (2) transmission loss, evaluated by comparing practical transmission performance against the Shannon limit. Semantic losses are estimated by evaluating the accuracy of EO applications using four task-oriented ML models, EfficientViT, MobileViT, ResNet50-DINO, and ResNet8-KD, on lossy image datasets under varying channel conditions and compression ratios. These results underpin a framework for efficient semantic-loss modeling in bandwidth-constrained EO scenarios, enabling more reliable and effective operations.
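
The Shannon limit used as the reference point for transmission loss is C = B · log2(1 + SNR). The sketch below computes it and a normalised gap between capacity and an achieved rate; that gap measure, the function names, and the numeric values are assumptions for illustration, since the abstract does not give the paper's exact definition.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Shannon capacity of an AWGN channel, C = B * log2(1 + SNR), in bits per second."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def capacity_gap(bandwidth_hz, snr_db, achieved_bps):
    """Fraction of the Shannon capacity left unused by a practical link (assumed measure)."""
    capacity = shannon_capacity_bps(bandwidth_hz, snr_db)
    return 1.0 - achieved_bps / capacity


# Hypothetical link: 1 MHz bandwidth at 10 dB SNR, achieving 2.5 Mbit/s in practice.
print(shannon_capacity_bps(1e6, 10.0))
print(capacity_gap(1e6, 10.0, achieved_bps=2.5e6))
```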

CV · Oct 31, 2024
Semantic Knowledge Distillation for Onboard Satellite Earth Observation Image Classification

Thanh-Dung Le, Vu Nguyen Ha, Ti Ti Nguyen et al.

This study presents an innovative dynamic weighting knowledge distillation (KD) framework tailored for efficient Earth observation (EO) image classification (IC) in resource-constrained settings. Utilizing EfficientViT and MobileViT as teacher models, this framework enables lightweight student models, particularly ResNet8 and ResNet16, to surpass 90% in accuracy, precision, and recall, adhering to the stringent confidence thresholds necessary for reliable classification tasks. Unlike conventional KD methods that rely on static weight distribution, our adaptive weighting mechanism responds to each teacher model's confidence, allowing student models to prioritize more credible sources of knowledge dynamically. Remarkably, ResNet8 delivers substantial efficiency gains, achieving a 97.5% reduction in parameters, a 96.7% decrease in FLOPs, an 86.2% cut in power consumption, and a 63.5% increase in inference speed over MobileViT. This significant optimization of complexity and resource demands establishes ResNet8 as an optimal candidate for EO tasks, combining robust performance with feasibility in deployment. The confidence-based, adaptable KD approach underscores the potential of dynamic distillation strategies to yield high-performing, resource-efficient models tailored for satellite-based EO applications. The reproducible code is accessible on our GitHub repository.
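
One way to picture confidence-based weighting across several teachers is sketched below. It assumes each teacher's confidence is its top softmax probability and that the weights are these confidences normalised to sum to one; the abstract does not give the actual weighting rule, so this proxy, the temperature, and all names are hypothetical rather than the paper's method.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled, numerically stable softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weights(teacher_probs):
    """Weight each teacher by its top-class probability (an assumed confidence proxy)."""
    confidences = [max(p) for p in teacher_probs]
    total = sum(confidences)
    return [c / total for c in confidences]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def distillation_loss(student_logits, teacher_logits_list, temperature=2.0):
    """Confidence-weighted sum of KL(teacher || student) over all teachers."""
    student = softmax(student_logits, temperature)
    teachers = [softmax(t, temperature) for t in teacher_logits_list]
    weights = confidence_weights(teachers)
    return sum(w * kl_divergence(t, student) for w, t in zip(weights, teachers))


# Toy logits for one image and three classes (hypothetical values).
confident_teacher = [4.0, 0.0, 0.0]
unsure_teacher = [1.0, 0.5, 0.0]
teacher_probs = [softmax(confident_teacher), softmax(unsure_teacher)]
print(confidence_weights(teacher_probs))
print(distillation_loss([2.0, 0.5, 0.0], [confident_teacher, unsure_teacher]))
```

Because the weights are recomputed per sample, the student leans on whichever teacher is more certain for that particular image, in contrast to a fixed, static weighting.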

CL · Apr 8, 2021
Machine Learning Based on Natural Language Processing to Detect Cardiac Failure in Clinical Narratives

Thanh-Dung Le, Rita Noumeir, Jerome Rambaud et al.

The purpose of this study is to develop a machine learning algorithm based on natural language processing that automatically detects whether a patient has cardiac failure or is in a healthy condition, using physician notes in the Research Data Warehouse at CHU Sainte Justine Hospital. First, word representation learning techniques were employed: bag-of-words (BoW), term frequency-inverse document frequency (TFIDF), and neural word embeddings (word2vec). Each representation technique aims to retain the semantic and syntactic information of words in critical care data. This enriches the mutual information of the word representation and benefits the subsequent analysis steps. Second, a machine learning classifier was used to detect the patient's condition (cardiac failure or stable) from the word representation vector space created in the previous step. This machine learning approach is based on supervised binary classification algorithms, including logistic regression (LR), Gaussian Naive-Bayes (GaussianNB), and a multilayer perceptron neural network (MLPNN). Technically, it mainly optimizes the empirical loss while training the classifiers. As a result, an automatic learning algorithm is obtained that achieves high classification performance in terms of accuracy (acc), precision (pre), recall (rec), and F1 score (f1). The results show that the combination of TFIDF and MLPNN always outperformed the other combinations on all performance measures. Without any feature selection, the proposed framework yielded an overall classification performance with acc, pre, rec, and f1 of 84%, 82%, 85%, and 83%, respectively. Notably, when feature selection was applied well, the overall performance improved by up to 4% for each evaluation.

CL · Apr 8, 2021
Detecting of a Patient's Condition From Clinical Narratives Using Natural Language Representation

Thanh-Dung Le, Rita Noumeir, Jerome Rambaud et al.

The rapid progress in clinical data management systems and artificial intelligence approaches enables the era of personalized medicine. Intensive care units (ICUs) are the ideal clinical research environment for such development because they collect large amounts of clinical data and are highly computerized environments. We designed a retrospective clinical study on a prospective ICU database using clinical natural language to help in the early diagnosis of heart failure in critically ill children. The methodology consisted of empirical experiments with a learning algorithm to learn the hidden interpretation and presentation of the French clinical note data. This study included 1386 patients' clinical notes with 5444 single lines of notes. There were 1941 positive cases (36% of the total) and 3503 negative cases classified by two independent physicians using a standardized approach. The multilayer perceptron neural network outperforms other discriminative and generative classifiers. Consequently, the proposed framework yields an overall classification performance with 89% accuracy, 88% recall, and 89% precision. This study successfully applied learning representation and machine learning algorithms to detect heart failure from clinical natural language in a single French institution. Further work is needed to use the same methodology in other institutions and other languages.

SP · Oct 23, 2018
Reproducing AmbientGAN: Generative models from lossy measurements

Mehdi Ahmadi, Timothy Nest, Mostafa Abdelnaim et al.

In recent years, Generative Adversarial Networks (GANs) have shown substantial progress in modeling complex distributions of data. These networks have received tremendous attention since they can generate implicit probabilistic models that produce realistic data using a stochastic procedure. While such models have proven highly effective in diverse scenarios, they require a large set of fully observed training samples. In many applications, access to such samples is difficult or even impractical, and only noisy or partial observations of the desired distribution are available. Recent research has tried to address the problem of incompletely observed samples to recover the distribution of the data. Zhu et al. (2017) and Yeh et al. (2016) proposed methods to solve ill-posed inverse problems using cycle-consistency and latent-space mappings in adversarial networks, respectively. Bora et al. (2017) and Kabkab et al. (2018) have applied similar adversarial approaches to the problem of compressed sensing. In this work, we focus on a new variant of GAN models called AmbientGAN, which incorporates a measurement process (e.g., adding noise, data removal, and projection) into GAN training. While in the standard GAN the discriminator distinguishes a generated image from a real image, in the AmbientGAN model the discriminator has to separate a real measurement from a simulated measurement of a generated image. The results shown by Bora et al. (2018) are quite promising for the problem of incomplete data and have potentially important implications for generative approaches to compressed sensing and ill-posed problems.
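
The key difference from a standard GAN, namely that generated images pass through the measurement process before the discriminator sees them, can be sketched as follows. The pixel-dropping measurement, the drop probability, and the function names are illustrative assumptions; no generator or discriminator network is modelled here.

```python
import random


def block_pixels(image, drop_prob, rng):
    """Measurement process: zero each pixel independently with probability drop_prob."""
    return [[0.0 if rng.random() < drop_prob else px for px in row] for row in image]


def discriminator_inputs(real_measurement, generated_image, drop_prob, rng):
    """Pair a real measurement with a simulated measurement of a generated image.

    The real data is already lossy, so only the generated image is measured here.
    """
    simulated = block_pixels(generated_image, drop_prob, rng)
    return real_measurement, simulated


rng = random.Random(0)
clean_image = [[1.0, 2.0], [3.0, 4.0]]      # stands in for a generator output
observed = block_pixels(clean_image, 0.5, rng)  # stands in for a lossy real sample
real, fake = discriminator_inputs(observed, clean_image, 0.5, rng)
print(real)
print(fake)
```

Since the discriminator only ever compares measurements, the generator is pushed to produce clean images whose measurements look like the observed lossy data.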