Quang-Huy Nguyen

LG
h-index79
12papers
52citations
Novelty48%
AI Score53

12 Papers

CVNov 13, 2022Code
Enhancing Few-shot Image Classification with Cosine Transformer

Quang-Huy Nguyen, Cuong Q. Nguyen, Dung D. Le et al. · cmu, deepmind

This paper addresses the few-shot image classification problem, where the classification task is performed on unlabeled query samples given a small amount of labeled support samples only. One major challenge of the few-shot learning problem is the large variety of object visual appearances that prevents the support samples to represent that object comprehensively. This might result in a significant difference between support and query samples, therefore undermining the performance of few-shot algorithms. In this paper, we tackle the problem by proposing Few-shot Cosine Transformer (FS-CT), where the relational map between supports and queries is effectively obtained for the few-shot tasks. The FS-CT consists of two parts, a learnable prototypical embedding network to obtain categorical representations from support samples with hard cases, and a transformer encoder to effectively achieve the relational map from two different support and query samples. We introduce Cosine Attention, a more robust and stable attention module that enhances the transformer module significantly and therefore improves FS-CT performance from 5% to over 20% in accuracy compared to the default scaled dot-product mechanism. Our method performs competitive results in mini-ImageNet, CUB-200, and CIFAR-FS on 1-shot learning and 5-shot learning tasks across backbones and few-shot configurations. We also developed a custom few-shot dataset for Yoga pose recognition to demonstrate the potential of our algorithm for practical application. Our FS-CT with cosine attention is a lightweight, simple few-shot algorithm that can be applied for a wide range of applications, such as healthcare, medical, and security surveillance. The official implementation code of our Few-shot Cosine Transformer is available at https://github.com/vinuni-vishc/Few-Shot-Cosine-Transformer

35.1LGMay 30
scBatchProx: Federated-Inspired Refinement for Stable Cell-Type Discriminability under Heterogeneous Batch Compositions

Quang-Huy Nguyen, Jiaqi Wang, Wei-Shinn Ku

Single-cell integration workflows often construct low-dimensional cell embeddings and then refine them with post-hoc methods to reduce batch effects. This refinement process can become unstable when cell-type compositions vary across batches, with some populations underrepresented or absent in particular batches. The problem becomes more consequential in dynamic single-cell data systems, where newly acquired batches can change both technical conditions and cell-type composition. Such instability can reduce downstream cell-type classification performance and weaken stability under imbalance perturbations. We introduce scBatchProx, a lightweight post-hoc refinement method for stabilizing single-cell latent embeddings in these heterogeneous and evolving settings. scBatchProx operates directly on precomputed embeddings and treats each batch or study as a client in a federated-inspired optimization procedure. A batch-conditioned FiLM adapter learns local latent updates, while proximal and identity-preserving regularization keep these updates conservative. Experiments on multi-batch and cross-study single-cell datasets show that scBatchProx improves downstream cell-type classification across different upstream embeddings. In controlled imbalance perturbations, scBatchProx maintains more stable affected-cell-type F1 when selected populations are downsampled or ablated from one batch. In cumulative retraining and continual integration settings, scBatchProx remains effective as new datasets arrive over time. Together, these results suggest that conservative, federated-inspired refinement can help maintain stable single-cell embeddings as batch compositions change across datasets and over time.

CLJun 8, 2023
Improving Vietnamese Legal Question--Answering System based on Automatic Data Enrichment

Thi-Hai-Yen Vuong, Ha-Thanh Nguyen, Quang-Huy Nguyen et al.

Question answering (QA) in law is a challenging problem because legal documents are much more complicated than normal texts in terms of terminology, structure, and temporal and logical relationships. It is even more difficult to perform legal QA for low-resource languages like Vietnamese where labeled data are rare and pre-trained language models are still limited. In this paper, we try to overcome these limitations by implementing a Vietnamese article-level retrieval-based legal QA system and introduce a novel method to improve the performance of language models by improving data quality through weak labeling. Our hypothesis is that in contexts where labeled data are limited, efficient data enrichment can help increase overall performance. Our experiments are designed to test multiple aspects, which demonstrate the effectiveness of the proposed technique.

CLDec 16, 2022
Law to Binary Tree -- An Formal Interpretation of Legal Natural Language

Ha-Thanh Nguyen, Vu Tran, Ngoc-Cam Le et al.

Knowledge representation and reasoning in law are essential to facilitate the automation of legal analysis and decision-making tasks. In this paper, we propose a new approach based on legal science, specifically legal taxonomy, for representing and reasoning with legal documents. Our approach interprets the regulations in legal documents as binary trees, which facilitates legal reasoning systems to make decisions and resolve logical contradictions. The advantages of this approach are twofold. First, legal reasoning can be performed on the basis of the binary tree representation of the regulations. Second, the binary tree representation of the regulations is more understandable than the existing sentence-based representations. We provide an example of how our approach can be used to interpret the regulations in a legal document.

60.6IRApr 21Code
From Top-1 to Top-K: A Reproducibility Study and Benchmarking of Counterfactual Explanations for Recommender Systems

Quang-Huy Nguyen, Thanh-Hai Nguyen, Khac-Manh Thai et al.

Counterfactual explanations (CEs) provide an intuitive way to understand recommender systems by identifying minimal modifications to user-item interactions that alter recommendation outcomes. Existing CE methods for recommender systems, however, have been evaluated under heterogeneous protocols, using different datasets, recommenders, metrics, and even explanation formats, which hampers reproducibility and fair comparison. Our paper systematically reproduces, re-implement, and re-evaluate eleven state-of-the-art CE methods for recommender systems, covering both native explainers (e.g., LIME-RS, SHAP, PRINCE, ACCENT, LXR, GREASE) and specific graph-based explainers originally proposed for GNNs. Here, a unified benchmarking framework is proposed to assess explainers along three dimensions: explanation format (implicit vs. explicit), evaluation level (item-level vs. list-level), and perturbation scope (user interaction vectors vs. user-item interaction graphs). Our evaluation protocol includes effectiveness, sparsity, and computational complexity metrics, and extends existing item-level assessments to top-K list-level explanations. Through extensive experiments on three real-world datasets and six representative recommender models, we analyze how well previously reported strengths of CE methods generalize across diverse setups. We observe that the trade-off between effectiveness and sparsity depends strongly on the specific method and evaluation setting, particularly under the explicit format; in addition, explainer performance remains largely consistent across item level and list level evaluations, and several graph-based explainers exhibit notable scalability limitations on large recommender graphs. Our results refine and challenge earlier conclusions about the robustness and practicality of CE generation methods in recommender systems: https://github.com/L2R-UET/CFExpRec.

IVApr 4, 2024Code
LiteNeXt: A Novel Lightweight ConvMixer-based Model with Self-embedding Representation Parallel for Medical Image Segmentation

Ngoc-Du Tran, Thi-Thao Tran, Quang-Huy Nguyen et al.

The emergence of deep learning techniques has advanced the image segmentation task, especially for medical images. Many neural network models have been introduced in the last decade bringing the automated segmentation accuracy close to manual segmentation. However, cutting-edge models like Transformer-based architectures rely on large scale annotated training data, and are generally designed with densely consecutive layers in the encoder, decoder, and skip connections resulting in large number of parameters. Additionally, for better performance, they often be pretrained on a larger data, thus requiring large memory size and increasing resource expenses. In this study, we propose a new lightweight but efficient model, namely LiteNeXt, based on convolutions and mixing modules with simplified decoder, for medical image segmentation. The model is trained from scratch with small amount of parameters (0.71M) and Giga Floating Point Operations Per Second (0.42). To handle boundary fuzzy as well as occlusion or clutter in objects especially in medical image regions, we propose the Marginal Weight Loss that can help effectively determine the marginal boundary between object and background. Additionally, the Self-embedding Representation Parallel technique is proposed as an innovative data augmentation strategy that utilizes the network architecture itself for self-learning augmentation, enhancing feature extraction robustness without external data. Experiments on public datasets including Data Science Bowls, GlaS, ISIC2018, PH2, Sunnybrook, and Lung X-ray data show promising results compared to other state-of-the-art CNN-based and Transformer-based architectures. Our code is released at: https://github.com/tranngocduvnvp/LiteNeXt.

LGNov 26, 2023
Controllable Expensive Multi-objective Learning with Warm-starting Bayesian Optimization

Quang-Huy Nguyen, Long P. Hoang, Hoang V. Viet et al.

Pareto Set Learning (PSL) is a promising approach for approximating the entire Pareto front in multi-objective optimization (MOO) problems. However, existing derivative-free PSL methods are often unstable and inefficient, especially for expensive black-box MOO problems where objective function evaluations are costly. In this work, we propose to address the instability and inefficiency of existing PSL methods with a novel controllable PSL method, called Co-PSL. Particularly, Co-PSL consists of two stages: (1) warm-starting Bayesian optimization to obtain quality Gaussian Processes priors and (2) controllable Pareto set learning to accurately acquire a parametric mapping from preferences to the corresponding Pareto solutions. The former is to help stabilize the PSL process and reduce the number of expensive function evaluations. The latter is to support real-time trade-off control between conflicting objectives. Performances across synthesis and real-world MOO problems showcase the effectiveness of our Co-PSL for expensive multi-objective optimization tasks.

30.7LGMay 7
FedeKD: Energy-Based Gating for Robust Federated Knowledge Distillation under Heterogeneous Settings

Quang-Huy Nguyen, Jiaqi Wang, Wei-shinn Ku

Federated learning (FL) operates in heterogeneous environments, where variations in data distributions and asymmetric model design often result in negative transfer. While federated knowledge distillation (FKD) avoids direct model parameter sharing, existing methods typically rely on public datasets or assume that transferred knowledge is uniformly reliable, which limits their robustness in practice. This paper presents FedeKD, a reliability-aware FKD framework that makes sample-wise trust estimation an explicit component of knowledge transfer, without relying on additional public data. Each client maintains a high-capacity private model for local learning and a lightweight shared proxy model for cross-client knowledge exchange. During training, proxy models are aggregated on the server to form a global proxy, which is then used to guide updates of the private models. At the core of FedeKD is an energy-based gating mechanism that converts task-specific private-proxy disagreement into sample-wise trust weights for backward distillation. This mechanism enables sample-wise weighting of knowledge transfer, where the proxy model contributes more to reliable samples while down-weighting unreliable ones. Extensive experiments on six real-world datasets demonstrate that FedeKD significantly reduces negative transfer under heterogeneous settings while maintaining strong predictive performance.

LGFeb 26
Conformalized Neural Networks for Federated Uncertainty Quantification under Dual Heterogeneity

Quang-Huy Nguyen, Jiaqi Wang, Wei-Shinn Ku

Federated learning (FL) faces challenges in uncertainty quantification (UQ). Without reliable UQ, FL systems risk deploying overconfident models at under-resourced agents, leading to silent local failures despite seemingly satisfactory global performance. Existing federated UQ approaches often address data heterogeneity or model heterogeneity in isolation, overlooking their joint effect on coverage reliability across agents. Conformal prediction is a widely used distribution-free UQ framework, yet its applications in heterogeneous FL settings remains underexplored. We provide FedWQ-CP, a simple yet effective approach that balances empirical coverage performance with efficiency at both global and agent levels under the dual heterogeneity. FedWQ-CP performs agent-server calibration in a single communication round. On each agent, conformity scores are computed on calibration data and a local quantile threshold is derived. Each agent then transmits only its quantile threshold and calibration sample size to the server. The server simply aggregates these thresholds through a weighted average to produce a global threshold. Experimental results on seven public datasets for both classification and regression demonstrate that FedWQ-CP empirically maintains agent-wise and global coverage while producing the smallest prediction sets or intervals.

LGDec 23, 2024
Improving Pareto Set Learning for Expensive Multi-objective Optimization via Stein Variational Hypernetworks

Minh-Duc Nguyen, Phuong Mai Dinh, Quang-Huy Nguyen et al.

Expensive multi-objective optimization problems (EMOPs) are common in real-world scenarios where evaluating objective functions is costly and involves extensive computations or physical experiments. Current Pareto set learning methods for such problems often rely on surrogate models like Gaussian processes to approximate the objective functions. These surrogate models can become fragmented, resulting in numerous small uncertain regions between explored solutions. When using acquisition functions such as the Lower Confidence Bound (LCB), these uncertain regions can turn into pseudo-local optima, complicating the search for globally optimal solutions. To address these challenges, we propose a novel approach called SVH-PSL, which integrates Stein Variational Gradient Descent (SVGD) with Hypernetworks for efficient Pareto set learning. Our method addresses the issues of fragmented surrogate models and pseudo-local optima by collectively moving particles in a manner that smooths out the solution space. The particles interact with each other through a kernel function, which helps maintain diversity and encourages the exploration of underexplored regions. This kernel-based interaction prevents particles from clustering around pseudo-local optima and promotes convergence towards globally optimal solutions. Our approach aims to establish robust relationships between trade-off reference vectors and their corresponding true Pareto solutions, overcoming the limitations of existing methods. Through extensive experiments across both synthetic and real-world MOO benchmarks, we demonstrate that SVH-PSL significantly improves the quality of the learned Pareto set, offering a promising solution for expensive multi-objective optimization problems.

LGMar 12, 2025
Revisiting semi-supervised learning in the era of foundation models

Ping Zhang, Zheda Mai, Quang-Huy Nguyen et al.

Semi-supervised learning (SSL) leverages abundant unlabeled data alongside limited labeled data to enhance learning. As vision foundation models (VFMs) increasingly serve as the backbone of vision applications, it remains unclear how SSL interacts with these pre-trained models. To address this gap, we develop new SSL benchmark datasets where frozen VFMs underperform and systematically evaluate representative SSL methods. We make a surprising observation: parameter-efficient fine-tuning (PEFT) using only labeled data often matches SSL performance, even without leveraging unlabeled data. This motivates us to revisit self-training, a conceptually simple SSL baseline, where we use the supervised PEFT model to pseudo-label unlabeled data for further training. To overcome the notorious issue of noisy pseudo-labels, we propose ensembling multiple PEFT approaches and VFM backbones to produce more robust pseudo-labels. Empirical results validate the effectiveness of this simple yet powerful approach, providing actionable insights into SSL with VFMs and paving the way for more scalable and practical semi-supervised learning in the era of foundation models.

LGFeb 5, 2024
Detecting Out-of-Distribution Objects through Class-Conditioned Inpainting

Quang-Huy Nguyen, Jin Peng Zhou, Zhenzhen Liu et al.

Recent object detectors have achieved impressive accuracy in identifying objects seen during training. However, real-world deployment often introduces novel and unexpected objects, referred to as out-of-distribution (OOD) objects, posing significant challenges to model trustworthiness. Modern object detectors are typically overconfident, making it unreliable to use their predictions alone for OOD detection. To address this, we propose leveraging an auxiliary model as a complementary solution. Specifically, we utilize an off-the-shelf text-to-image generative model, such as Stable Diffusion, which is trained with objective functions distinct from those of discriminative object detectors. We hypothesize that this fundamental difference enables the detection of OOD objects by measuring inconsistencies between the models. Concretely, for a given detected object bounding box and its predicted in-distribution class label, we perform class-conditioned inpainting on the image with the object removed. If the object is OOD, the inpainted image is likely to deviate significantly from the original, making the reconstruction error a robust indicator of OOD status. Extensive experiments demonstrate that our approach consistently surpasses existing zero-shot and non-zero-shot OOD detection methods, establishing a robust framework for enhancing object detection systems in dynamic environments.