Quan Nguyen

LG
h-index: 41
48 papers
629 citations
Novelty: 51%
AI Score: 57

48 Papers

RO · Sep 24, 2024
Autotuning Bipedal Locomotion MPC with GRFM-Net for Efficient Sim-to-Real Transfer

Qianzhong Chen, Junheng Li, Sheng Cheng et al. · stanford

Bipedal locomotion control is essential for humanoid robots to navigate complex, human-centric environments. While optimization-based control designs are popular for integrating sophisticated models of humanoid robots, they often require labor-intensive manual tuning. In this work, we address the challenges of parameter selection in bipedal locomotion control using DiffTune, a model-based autotuning method that leverages differentiable programming for efficient parameter learning. A major difficulty lies in balancing model fidelity with differentiability. We address this difficulty using a low-fidelity model for differentiability, enhanced by a Ground Reaction Force-and-Moment Network (GRFM-Net) to capture discrepancies between MPC commands and actual control effects. We validate the parameters learned by DiffTune with GRFM-Net in hardware experiments, which demonstrate the parameters' optimality in a multi-objective setting compared with baseline parameters, reducing the total loss by up to 40.5% compared with the expert-tuned parameters. The results confirm the GRFM-Net's effectiveness in mitigating the sim-to-real gap, improving the transferability of simulation-learned parameters to real hardware.

CV · Nov 20, 2022
FedDCT: Federated Learning of Large Convolutional Neural Networks on Resource Constrained Devices using Divide and Collaborative Training

Quan Nguyen, Hieu H. Pham, Kok-Seng Wong et al. · cmu, deepmind

We introduce FedDCT, a novel distributed learning paradigm that enables the usage of large, high-performance CNNs on resource-limited edge devices. As opposed to traditional FL approaches, which require each client to train the full-size neural network independently during each training round, the proposed FedDCT allows a cluster of several clients to collaboratively train a large deep learning model by dividing it into an ensemble of several small sub-models and training them on multiple devices in parallel while maintaining privacy. In this collaborative training process, clients from the same cluster can also learn from each other, further improving their ensemble performance. In the aggregation stage, the server takes a weighted average of all the ensemble models trained by all the clusters. FedDCT reduces the memory requirements and allows low-end devices to participate in FL. We conduct extensive experiments on standardized datasets, including CIFAR-10, CIFAR-100, and two real-world medical datasets, HAM10000 and VAIPE. Experimental results show that FedDCT outperforms a set of current SOTA FL methods with interesting convergence behaviors. Furthermore, compared to other existing approaches, FedDCT achieves higher accuracy and substantially reduces the number of communication rounds (with 4-8 times lower memory requirements) to achieve the desired accuracy on the testing dataset without incurring any extra training cost on the server side.
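The aggregation step described in the abstract (a weighted average of the models returned by the clusters) can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen; weighting by each cluster's sample count, as in FedAvg, is an assumption here, and `weighted_average`, `cluster_a` and `cluster_b` are hypothetical names.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def weighted_average(models: List[Params], weights: List[float]) -> Params:
    """Average parameter vectors key by key, weighting model i by weights[i]."""
    if not models or len(models) != len(weights):
        raise ValueError("need at least one model and one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    averaged: Params = {}
    for name in models[0]:
        size = len(models[0][name])
        averaged[name] = [
            sum(w * m[name][i] for m, w in zip(models, weights)) / total
            for i in range(size)
        ]
    return averaged


# Two hypothetical clusters; the weights are assumed sample counts.
cluster_a = {"layer": [1.0, 2.0]}
cluster_b = {"layer": [3.0, 6.0]}
global_model = weighted_average([cluster_a, cluster_b], [100, 300])
```

Here the second cluster holds three times as much data, so the averaged parameters land three quarters of the way towards its values.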

LG · Oct 21, 2022
Local Bayesian optimization via maximizing probability of descent

Quan Nguyen, Kaiwen Wu, Jacob R. Gardner et al.

Local optimization presents a promising approach to expensive, high-dimensional black-box optimization by sidestepping the need to globally explore the search space. For objective functions whose gradient cannot be evaluated directly, Bayesian optimization offers one solution -- we construct a probabilistic model of the objective, design a policy to learn about the gradient at the current location, and use the resulting information to navigate the objective landscape. Previous work has realized this scheme by minimizing the variance in the estimate of the gradient, then moving in the direction of the expected gradient. In this paper, we re-examine and refine this approach. We demonstrate that, surprisingly, the expected value of the gradient is not always the direction maximizing the probability of descent, and in fact, these directions may be nearly orthogonal. This observation then inspires an elegant optimization scheme seeking to maximize the probability of descent while moving in the direction of most-probable descent. Experiments on both synthetic and real-world objectives show that our method outperforms previous realizations of this optimization scheme and is competitive against other, significantly more complicated baselines.
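The gap between the expected gradient and the direction of most-probable descent can be checked numerically. The sketch below assumes a Gaussian belief g ~ N(mean, cov) over the gradient at the current point, which is what a Gaussian process model yields. For a direction v, the directional derivative v·g is then Gaussian, so P(v·g < 0) = Φ(−v·mean / sqrt(vᵀ cov v)), which is maximized by v proportional to −cov⁻¹ mean. The function names and the 2-D numbers are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def normal_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def unit(v: Vector) -> Vector:
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def descent_probability(v: Vector, mean: Vector, cov: Matrix) -> float:
    """P(v . g < 0) when the gradient g ~ N(mean, cov)."""
    n = len(v)
    proj_mean = sum(v[i] * mean[i] for i in range(n))
    proj_var = sum(v[i] * cov[i][j] * v[j] for i in range(n) for j in range(n))
    return normal_cdf(-proj_mean / math.sqrt(proj_var))


def most_probable_descent_2d(mean: Vector, cov: Matrix) -> Vector:
    """Unit vector along -cov^{-1} mean, using the closed-form 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    solved = [(d * mean[0] - b * mean[1]) / det, (a * mean[1] - c * mean[0]) / det]
    return unit([-s for s in solved])


# Hypothetical belief: large mean but huge uncertainty along axis 0,
# small mean but tiny uncertainty along axis 1.
mean = [1.0, 0.1]
cov = [[100.0, 0.0], [0.0, 0.01]]

v_mean = unit([-m for m in mean])            # negative expected gradient
v_best = most_probable_descent_2d(mean, cov)  # most-probable descent direction
p_mean = descent_probability(v_mean, mean, cov)
p_best = descent_probability(v_best, mean, cov)
cosine = sum(x * y for x, y in zip(v_mean, v_best))
```

With these numbers the negative expected gradient descends with probability of about 0.54, while the most-probable descent direction reaches about 0.84, and the two directions are roughly 84 degrees apart.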

LG · May 28
Q-ANCHOR: Federated Quantum Learning with ZNE-guided Correction

Hoang M. Ngo, Quan Nguyen, Wanli Xing et al.

Quantum Federated Learning (QFL) offers a promising framework to train quantum models across distributed clients while keeping data strictly local. Due to its simplicity and low communication overhead, Federated Averaging (FedAvg) is the standard aggregation choice in QFL literature. However, deploying QFL on practical hardware exposes a severe double-drift phenomenon: the global model is simultaneously derailed by client drift from non-IID data and hardware bias from noisy quantum gradient estimates. In this work, we first analyze the convergence of FedAvg under these realistic conditions, mathematically demonstrating that quantum hardware bias creates a persistent error floor that standard averaging cannot correct. To overcome this limitation, we propose Q-ANCHOR, a quantum-aware federated aggregation architecture that anchors server updates with zero-noise extrapolation while applying stateful client correction to suppress both client drift and hardware-induced bias. Our convergence theory proves that Q-ANCHOR successfully mitigates classical client drift while actively reducing the hardware-bias floor. Experimental results demonstrate that Q-ANCHOR achieves significantly more stable training than conventional FL baselines.
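Zero-noise extrapolation (ZNE), which the abstract names as the anchor for server updates, can be sketched generically: an estimate is measured at several amplified noise levels and a fitted curve is extrapolated back to noise scale zero. The linear least-squares fit below is one common choice and an assumption here; how Q-ANCHOR applies ZNE inside its aggregation rule is not specified in the abstract. `zne_linear` and the sample values are hypothetical.

```python
from typing import Sequence


def zne_linear(scales: Sequence[float], values: Sequence[float]) -> float:
    """Fit values ~ intercept + slope * scale and return the intercept,
    i.e. the estimate extrapolated to noise scale 0."""
    if len(scales) != len(values) or len(scales) < 2:
        raise ValueError("need at least two (scale, value) pairs")
    n = len(scales)
    mean_x = sum(scales) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in scales)
    if sxx == 0:
        raise ValueError("noise scales must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scales, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Hypothetical measurements: the noiseless value is 1.0 and each unit of
# noise scale biases the estimate downward by 0.1.
noise_scales = [1.0, 2.0, 3.0]
noisy_values = [0.9, 0.8, 0.7]
zero_noise_estimate = zne_linear(noise_scales, noisy_values)
```

When the bias really is linear in the noise scale, as in this toy data, the extrapolation recovers the noiseless value; on hardware the fit is only an approximation.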

RO · Oct 2, 2023
Generalized Animal Imitator: Agile Locomotion with Versatile Motion Prior

Ruihan Yang, Zhuoqun Chen, Jianhan Ma et al.

The agility of animals, particularly in complex activities such as running, turning, jumping, and backflipping, stands as an exemplar for robotic system design. Transferring this suite of behaviors to legged robotic systems raises essential questions: How can a robot learn multiple locomotion behaviors simultaneously? How can the robot execute these tasks with a smooth transition? How can these skills be integrated for wide-ranging applications? This paper introduces the Versatile Instructable Motion prior (VIM) - a Reinforcement Learning framework designed to incorporate a range of agile locomotion tasks suitable for advanced robotic applications. Our framework enables legged robots to learn diverse agile low-level skills by imitating animal motions and manually designed motions. Our Functionality reward guides the robot's ability to adopt varied skills, and our Stylization reward ensures that robot motions align with reference motions. Our evaluations of the VIM framework span both simulation and the real world. Our framework allows a robot to concurrently learn diverse agile locomotion skills using a single learning-based controller in the real world. Videos can be found on our website: https://rchalyang.github.io/VIM/

LG · Jan 11, 2023
Adversarial Online Multi-Task Reinforcement Learning

Quan Nguyen, Nishant A. Mehta

We consider the adversarial online multi-task reinforcement learning setting, where in each of $K$ episodes the learner is given an unknown task taken from a finite set of $M$ unknown finite-horizon MDP models. The learner's objective is to minimize its regret with respect to the optimal policy for each task. We assume the MDPs in $\mathcal{M}$ are well-separated under a notion of $\lambda$-separability, and show that this notion generalizes many task-separability notions from previous works. We prove a minimax lower bound of $\Omega(K\sqrt{DSAH})$ on the regret of any learning algorithm and an instance-specific lower bound of $\Omega(\frac{K}{\lambda^2})$ in sample complexity for a class of uniformly-good cluster-then-learn algorithms. We use a novel construction called 2-JAO MDP for proving the instance-specific lower bound. The lower bounds are complemented with a polynomial time algorithm that obtains $\tilde{O}(\frac{K}{\lambda^2})$ sample complexity guarantee for the clustering phase and $\tilde{O}(\sqrt{MK})$ regret guarantee for the learning phase, indicating that the dependency on $K$ and $\frac{1}{\lambda^2}$ is tight.

LG · Aug 30, 2023
Segmenting mechanically heterogeneous domains via unsupervised learning

Quan Nguyen, Emma Lejeune

From biological organs to soft robotics, highly deformable materials are essential components of natural and engineered systems. These highly deformable materials can have heterogeneous material properties, and can experience heterogeneous deformations with or without underlying material heterogeneity. Many recent works have established that computational modeling approaches are well suited for understanding and predicting the consequences of material heterogeneity and for interpreting observed heterogeneous strain fields. In particular, there has been significant work towards developing inverse analysis approaches that can convert observed kinematic quantities (e.g., displacement, strain) to material properties and mechanical state. Despite the success of these approaches, they are not necessarily generalizable and often rely on tight control and knowledge of boundary conditions. Here, we will build on the recent advances (and ubiquity) of machine learning approaches to explore alternative approaches to detect patterns in heterogeneous material properties and mechanical behavior. Specifically, we will explore unsupervised learning approaches to clustering and ensemble clustering to identify heterogeneous regions. Overall, we find that these approaches are effective, yet limited in their abilities. Through this initial exploration (where all data and code are published alongside this manuscript), we set the stage for future studies that more specifically adapt these methods to mechanical data.

LG · Feb 3
Q-ShiftDP: A Differentially Private Parameter-Shift Rule for Quantum Machine Learning

Hoang M. Ngo, Nhat Hoang-Xuan, Quan Nguyen et al.

Quantum Machine Learning (QML) promises significant computational advantages, but preserving training data privacy remains challenging. Classical approaches like differentially private stochastic gradient descent (DP-SGD) add noise to gradients but fail to exploit the unique properties of quantum gradient estimation. In this work, we introduce the Differentially Private Parameter-Shift Rule (Q-ShiftDP), the first privacy mechanism tailored to QML. By leveraging the inherent boundedness and stochasticity of quantum gradients computed via the parameter-shift rule, Q-ShiftDP enables tighter sensitivity analysis and reduces noise requirements. We combine carefully calibrated Gaussian noise with intrinsic quantum noise to provide formal privacy and utility guarantees, and show that harnessing quantum noise further improves the privacy-utility trade-off. Experiments on benchmark datasets demonstrate that Q-ShiftDP consistently outperforms classical DP methods in QML.
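The parameter-shift rule and the boundedness the abstract refers to can be shown with a one-parameter toy example. The sketch assumes the textbook case of a single-qubit RY(theta) rotation followed by a Pauli-Z measurement, whose expectation is cos(theta); the gradient is then (f(theta + pi/2) - f(theta - pi/2)) / 2, and because each expectation lies in [-1, 1] the gradient does too. The added Gaussian noise uses a hypothetical, uncalibrated `sigma`; it does not reproduce the calibration analysed in the paper.

```python
import math
import random
from typing import Callable


def expectation(theta: float) -> float:
    """<Z> after RY(theta) on |0>; stands in for a circuit evaluation."""
    return math.cos(theta)


def parameter_shift_gradient(f: Callable[[float], float], theta: float) -> float:
    """Exact gradient from two shifted evaluations (shift fixed at pi/2)."""
    shift = math.pi / 2.0
    return (f(theta + shift) - f(theta - shift)) / 2.0


def noisy_gradient(
    f: Callable[[float], float], theta: float, sigma: float, rng: random.Random
) -> float:
    """Parameter-shift gradient plus Gaussian noise of assumed scale sigma."""
    return parameter_shift_gradient(f, theta) + rng.gauss(0.0, sigma)


theta = 0.3
exact = parameter_shift_gradient(expectation, theta)  # analytically -sin(theta)
private = noisy_gradient(expectation, theta, sigma=0.05, rng=random.Random(0))
```

Because the shifted expectations are bounded, no explicit gradient clipping is needed to bound the sensitivity in this toy case, which is the property the abstract says enables a tighter analysis.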

LG · May 11
The Geometric Wall: Manifold Structure Predicts Layerwise Sparse Autoencoder Scaling Laws

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen et al.

Sparse autoencoders (SAEs) operationalise the linear representation hypothesis: they reconstruct model activations as sparse linear combinations of interpretable dictionary atoms, on the implicit assumption that activation space is well approximated by a globally linear structure. Their reconstruction error varies sharply across layers in ways that existing scaling laws, fitted at single layers, do not explain. We argue that this variation is the empirical trace of a geometric mismatch: where the activation manifold is curved and its intrinsic dimension varies across layers, no sparse linear dictionary can match it uniformly, and the SAE's width-sparsity scaling becomes a layer-dependent function of manifold structure rather than a single universal law. We conduct the first cross-layer SAE scaling study, fitting and regressing on 844 residual-stream Gemma Scope SAE checkpoints across 68 layers of Gemma 2 2B and 9B. Stage 1 fits a per-layer scaling-law surface; Stage 2 regresses the fitted parameters and the derived per-layer width exponents on four layerwise geometric summaries. We find that manifold geometry predicts the per-layer width exponent in both models, and that the same regression coefficients learnt on one model predict the other model's per-layer exponents under cross-model transfer, indicating a transferable geometric law. At the showcase layers where richer width grids permit identification of the asymptotic floor, we find that the fitted floor tracks the layerwise geometric ordering: higher curvature and intrinsic dimension correspond to higher floor, consistent with the irreducible second-order residual that any sparse linear approximation of a curved manifold must leave behind. SAEs thus encounter not a finite-resource ceiling but a geometry-dependent wall, set by the manifold they are trying to reconstruct.

RO · Apr 24
Learning-augmented robotic automation for real-world manufacturing

Yunho Kim, Quan Nguyen, Taewhan Kim et al.

Industrial robots are widely used in manufacturing, yet most manipulation still depends on fixed waypoint scripts that are brittle to environmental changes. Learning-based control offers a more adaptive alternative, but it remains unclear whether such methods, still mostly confined to laboratory demonstrations, can sustain hours of reliable operation, deliver consistent quality, and behave safely around people on a live production line. Here we present Learning-Augmented Robotic Automation, a hybrid system that integrates learned task controllers and a neural 3D safety monitor into conventional industrial workflows. We deployed the system on an electric-motor production line to automate deformable cable insertion and soldering under real manufacturing constraints, a step previously performed manually by human workers. With less than 20 min of real-world data per task, the system operated continuously for 5 h 10 min, producing 108 motors without physical fencing and achieving a 99.4% pass rate on product-level quality-control tests. It maintained near-human takt time while reducing variability in solder-joint quality and cycle time. These results establish a practical pathway for extending industrial automation with learning-based methods.

LG · Jan 26
Counterfactual Explanations on Robust Perceptual Geodesics

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen et al.

Latent-space optimization methods for counterfactual explanations - framed as minimal semantic perturbations that change model predictions - inherit the ambiguity of Wachter et al.'s objective: the choice of distance metric dictates whether perturbations are meaningful or adversarial. Existing approaches adopt flat or misaligned geometries, leading to off-manifold artifacts, semantic drift, or adversarial collapse. We introduce Perceptual Counterfactual Geodesics (PCG), a method that constructs counterfactuals by tracing geodesics under a perceptually Riemannian metric induced from robust vision features. This geometry aligns with human perception and penalizes brittle directions, enabling smooth, on-manifold, semantically valid transitions. Experiments on three vision datasets show that PCG outperforms baselines and reveals failure modes hidden under standard metrics.

ML · May 3, 2024
Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

Quan Nguyen, Adji Bousso Dieng

Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data. In this paper, we extend the Vendi scores -- a family of interpretable similarity-based diversity metrics -- to account for quality. We then leverage these quality-weighted Vendi scores to tackle experimental design problems across various applications, including drug discovery, materials discovery, and reinforcement learning. We found that quality-weighted Vendi scores allow us to construct policies for experimental design that flexibly balance quality and diversity, and ultimately assemble rich and diverse sets of high-performing data points. Our algorithms led to a 70%-170% increase in the number of effective discoveries compared to baselines.
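A small worked case may help make the quality-weighted score concrete. The sketch assumes the usual definition of the Vendi score as the exponential of the Shannon entropy of the eigenvalues of K/n, where K is a similarity kernel with unit diagonal, and assumes that the quality-weighted variant multiplies this by the mean quality of the items. It is restricted to two items, where the eigenvalues of K/2 for K = [[1, s], [s, 1]] are (1 + s)/2 and (1 - s)/2 in closed form. The function names are hypothetical.

```python
import math
from typing import Sequence


def vendi_score_two_items(similarity: float) -> float:
    """Vendi score of two items with kernel [[1, s], [s, 1]], 0 <= s <= 1."""
    eigenvalues = [(1.0 + similarity) / 2.0, (1.0 - similarity) / 2.0]
    # Zero eigenvalues contribute nothing to the entropy (0 * log 0 := 0).
    entropy = -sum(lam * math.log(lam) for lam in eigenvalues if lam > 0.0)
    return math.exp(entropy)


def quality_weighted_vendi_two_items(
    similarity: float, qualities: Sequence[float]
) -> float:
    """Assumed form: mean quality times the Vendi score."""
    mean_quality = sum(qualities) / len(qualities)
    return mean_quality * vendi_score_two_items(similarity)


distinct = vendi_score_two_items(0.0)   # two completely dissimilar items
identical = vendi_score_two_items(1.0)  # two copies of the same item
weighted = quality_weighted_vendi_two_items(0.0, [0.5, 1.0])
```

The score behaves like an effective number of distinct items: two dissimilar items count as two, two identical items count as one, and the quality weighting scales that count down when the items are of lower quality.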

CL · Dec 18, 2023
VinaLLaMA: LLaMA-based Vietnamese Foundation Model

Quan Nguyen, Huy Pham, Dung Dao

In this technical report, we present VinaLLaMA, an open-weight, state-of-the-art (SOTA) Large Language Model for the Vietnamese language, built upon LLaMA-2 with an additional 800 billion trained tokens. VinaLLaMA not only demonstrates fluency in Vietnamese but also exhibits a profound understanding of Vietnamese culture, making it a truly indigenous model. VinaLLaMA-7B-chat, trained on 1 million high-quality synthetic samples, achieves SOTA results on key benchmarks, including VLSP, VMLU, and Vicuna Benchmark Vietnamese, marking a significant advancement in the Vietnamese AI landscape and offering a versatile resource for various applications.

CL · Feb 18, 2025
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

Longxu Dou, Qian Liu, Fan Zhou et al.

Sailor2 is a family of cutting-edge multilingual language models for South-East Asian (SEA) languages, available in 1B, 8B, and 20B sizes to suit diverse applications. Building on Qwen2.5, Sailor2 undergoes continuous pre-training on 500B tokens (400B SEA-specific and 100B replay tokens) to support 13 SEA languages while retaining proficiency in Chinese and English. The Sailor2-20B model achieves a 50-50 win rate against GPT-4o across SEA languages. We also deliver a comprehensive cookbook on how to develop multilingual models efficiently, covering five key aspects: data curation, pre-training, post-training, model customization and evaluation. We hope that the Sailor2 model (Apache 2.0 license) will drive language development in the SEA region, and that the Sailor2 cookbook will inspire researchers to build more inclusive LLMs for other under-served languages.

LG · May 16, 2024
Manifold Integrated Gradients: Riemannian Geometry for Feature Attribution

Eslam Zaher, Maciej Trzaskowski, Quan Nguyen et al.

In this paper, we dive into the reliability concerns of Integrated Gradients (IG), a prevalent feature attribution method for black-box deep learning models. We particularly address two predominant challenges associated with IG: the generation of noisy feature visualizations for vision models and the vulnerability to adversarial attributional attacks. Our approach involves an adaptation of path-based feature attribution, aligning the path of attribution more closely to the intrinsic geometry of the data manifold. Our experiments utilise deep generative models applied to several real-world image datasets. They demonstrate that IG along the geodesics conforms to the curved geometry of the Riemannian data manifold, generating more perceptually intuitive explanations and, subsequently, substantially increasing robustness to targeted attributional attacks.
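For reference, the standard straight-line Integrated Gradients that the paper takes as its starting point can be sketched as follows. Attribution i is (x_i - baseline_i) times the average of the i-th partial derivative along the straight path from the baseline to the input, approximated here with a midpoint Riemann sum. This is the conventional baseline method, not the geodesic variant proposed in the paper; the toy model `grad_f` and all names are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def integrated_gradients(
    grad_fn: Callable[[Vector], Vector],
    x: Vector,
    baseline: Vector,
    steps: int = 50,
) -> Vector:
    """Straight-path Integrated Gradients with a midpoint Riemann sum."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval of [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = grad_fn(point)
        for i in range(len(x)):
            avg_grad[i] += grad[i] / steps
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


def f(p: Vector) -> float:
    """Toy model: F(p) = p0^2 + 3 * p1."""
    return p[0] ** 2 + 3.0 * p[1]


def grad_f(p: Vector) -> Vector:
    """Analytic gradient of the toy model."""
    return [2.0 * p[0], 3.0]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
```

The attributions sum to F(x) - F(baseline), the completeness property of Integrated Gradients; replacing the straight path with a curve that follows the data manifold is the modification the abstract describes.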

CV · Apr 12, 2025
A Lightweight Moment Retrieval System with Global Re-Ranking and Robust Adaptive Bidirectional Temporal Search

Tinh-Anh Nguyen-Nhu, Huu-Loc Tran, Nguyen-Khang Le et al.

The exponential growth of digital video content has posed critical challenges in moment-level video retrieval, where existing methodologies struggle to efficiently localize specific segments within an expansive video corpus. Current retrieval systems are constrained by computational inefficiencies, temporal context limitations, and the intrinsic complexity of navigating video content. In this paper, we address these limitations through a novel Interactive Video Corpus Moment Retrieval framework that integrates a SuperGlobal Reranking mechanism and Adaptive Bidirectional Temporal Search (ABTS), strategically optimizing query similarity, temporal stability, and computational resources. By preprocessing a large corpus of videos using a keyframe extraction model and deduplication technique through image hashing, our approach provides a scalable solution that significantly reduces storage requirements while maintaining high localization precision across diverse video repositories.

CV · Apr 14, 2025
HDC: Hierarchical Distillation for Multi-level Noisy Consistency in Semi-Supervised Fetal Ultrasound Segmentation

Tran Quoc Khanh Le, Nguyen Lan Vi Vu, Ha-Hieu Pham et al.

Transvaginal ultrasound is a critical imaging modality for evaluating cervical anatomy and detecting physiological changes. However, accurate segmentation of cervical structures remains challenging due to low contrast, shadow artifacts, and indistinct boundaries. While convolutional neural networks (CNNs) have demonstrated efficacy in medical image segmentation, their reliance on large-scale annotated datasets presents a significant limitation in clinical ultrasound imaging. Semi-supervised learning (SSL) offers a potential solution by utilizing unlabeled data, yet existing teacher-student frameworks often encounter confirmation bias and high computational costs. In this paper, a novel semi-supervised segmentation framework, called HDC, is proposed incorporating adaptive consistency learning with a single-teacher architecture. The framework introduces a hierarchical distillation mechanism with two objectives: Correlation Guidance Loss for aligning feature representations and Mutual Information Loss for stabilizing noisy student learning. The proposed approach reduces model complexity while enhancing generalization. Experiments on fetal ultrasound datasets, FUGC and PSFH, demonstrate competitive performance with reduced computational overhead compared to multi-teacher models.

CV · Apr 14, 2025
IGL-DT: Iterative Global-Local Feature Learning with Dual-Teacher Semantic Segmentation Framework under Limited Annotation Scheme

Dinh Dai Quan Tran, Hoang-Thien Nguyen, Thanh-Huy Nguyen et al.

Semi-Supervised Semantic Segmentation (SSSS) aims to improve segmentation accuracy by leveraging a small set of labeled images alongside a larger pool of unlabeled data. Recent advances primarily focus on pseudo-labeling, consistency regularization, and co-training strategies. However, existing methods struggle to balance global semantic representation with fine-grained local feature extraction. To address this challenge, we propose a novel tri-branch semi-supervised segmentation framework incorporating a dual-teacher strategy, named IGL-DT. Our approach employs SwinUnet for high-level semantic guidance through Global Context Learning and ResUnet for detailed feature refinement via Local Regional Learning. Additionally, a Discrepancy Learning mechanism mitigates over-reliance on a single teacher, promoting adaptive feature learning. Extensive experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art approaches, achieving superior segmentation performance across various data regimes.

CV · Apr 11, 2025
Towards Efficient and Robust Moment Retrieval System: A Unified Framework for Multi-Granularity Models and Temporal Reranking

Huu-Loc Tran, Tinh-Anh Nguyen-Nhu, Huu-Phong Phan-Nguyen et al.

Long-form video understanding presents significant challenges for interactive retrieval systems, as conventional methods struggle to process extensive video content efficiently. Existing approaches often rely on single models, inefficient storage, unstable temporal search, and context-agnostic reranking, limiting their effectiveness. This paper presents a novel framework to enhance interactive video retrieval through four key innovations: (1) an ensemble search strategy that integrates coarse-grained (CLIP) and fine-grained (BEIT3) models to improve retrieval accuracy, (2) a storage optimization technique that reduces redundancy by selecting representative keyframes via TransNetV2 and deduplication, (3) a temporal search mechanism that localizes video segments using dual queries for start and end points, and (4) a temporal reranking approach that leverages neighboring frame context to stabilize rankings. Evaluated on known-item search and question-answering tasks, our framework demonstrates substantial improvements in retrieval precision, efficiency, and user interpretability, offering a robust solution for real-world interactive video retrieval applications.

CL · Mar 15, 2025
Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

Da Wu, Zhanliang Wang, Quan Nguyen et al.

Background: Several studies show that large language models (LLMs) struggle with phenotype-driven gene prioritization for rare diseases. These studies typically use Human Phenotype Ontology (HPO) terms to prompt foundation models like GPT and LLaMA to predict candidate genes. However, in real-world settings, foundation models are not optimized for domain-specific tasks like clinical diagnosis, and inputs are unstructured clinical notes rather than standardized terms. How LLMs can be instructed to predict candidate genes or disease diagnosis from unstructured clinical notes remains a major challenge. Methods: We introduce RAG-driven CoT and CoT-driven RAG, two methods that combine Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to analyze clinical notes. A five-question CoT protocol mimics expert reasoning, while RAG retrieves data from sources like HPO and OMIM (Online Mendelian Inheritance in Man). We evaluated these approaches on rare disease datasets, including 5,980 Phenopacket-derived notes, 255 literature-based narratives, and 220 in-house clinical notes from Children's Hospital of Philadelphia. Results: We found that recent foundation models, including Llama 3.3-70B-Instruct and DeepSeek-R1-Distill-Llama-70B, outperformed earlier versions such as Llama 2 and GPT-3.5. We also showed that RAG-driven CoT and CoT-driven RAG both outperform foundation models in candidate gene prioritization from clinical notes; in particular, both methods with a DeepSeek backbone resulted in a top-10 gene accuracy of over 40% on Phenopacket-derived clinical notes. RAG-driven CoT works better for high-quality notes, where early retrieval can anchor the subsequent reasoning steps in domain-specific evidence, while CoT-driven RAG has an advantage when processing lengthy and noisy notes.

LG · May 23, 2024
Amortized nonmyopic active search via deep imitation learning

Quan Nguyen, Anindya Sarkar, Roman Garnett

Active search formalizes a specialized active learning setting where the goal is to collect members of a rare, valuable class. The state-of-the-art algorithm approximates the optimal Bayesian policy in a budget-aware manner, and has been shown to achieve impressive empirical performance in previous work. However, even this approximate policy has a superlinear computational complexity with respect to the size of the search problem, rendering its application impractical in large spaces or in real-time systems where decisions must be made quickly. We study the amortization of this policy by training a neural network to learn to search. To circumvent the difficulty of learning from scratch, we appeal to imitation learning techniques to mimic the behavior of the expert, expensive-to-compute policy. Our policy network, trained on synthetic data, learns a beneficial search strategy that yields nonmyopic decisions carefully balancing exploration and exploitation. Extensive experiments demonstrate our policy achieves competitive performance at real-world tasks that closely approximates the expert's at a fraction of the cost, while outperforming cheaper baselines.

CV · Aug 2, 2025
GMAT: Grounded Multi-Agent Clinical Description Generation for Text Encoder in Vision-Language MIL for Whole Slide Image Classification

Ngoc Bui Lam Quang, Nam Le Nguyen Binh, Thanh-Huy Nguyen et al.

Multiple Instance Learning (MIL) is the leading approach for whole slide image (WSI) classification, enabling efficient analysis of gigapixel pathology slides. Recent work has introduced vision-language models (VLMs) into MIL pipelines to incorporate medical knowledge through text-based class descriptions rather than simple class names. However, when these methods rely on large language models (LLMs) to generate clinical descriptions or use fixed-length prompts to represent complex pathology concepts, the limited token capacity of VLMs often constrains the expressiveness and richness of the encoded class information. Additionally, descriptions generated solely by LLMs may lack domain grounding and fine-grained medical specificity, leading to suboptimal alignment with visual features. To address these challenges, we propose a vision-language MIL framework with two key contributions: (1) A grounded multi-agent description generation system that leverages curated pathology textbooks and agent specialization (e.g., morphology, spatial context) to produce accurate and diverse clinical descriptions; (2) A text encoding strategy using a list of descriptions rather than a single prompt, capturing fine-grained and complementary clinical signals for better alignment with visual features. Integrated into a VLM-MIL pipeline, our approach shows improved performance over single-prompt class baselines and achieves results comparable to state-of-the-art models, as demonstrated on renal and lung cancer datasets.

QM · May 9, 2025
Multimodal Integrated Knowledge Transfer to Large Language Models through Preference Optimization with Biomedical Applications

Da Wu, Zhanliang Wang, Quan Nguyen et al.

The scarcity of high-quality multimodal biomedical data limits the ability to effectively fine-tune pretrained Large Language Models (LLMs) for specialized biomedical tasks. To address this challenge, we introduce MINT (Multimodal Integrated kNowledge Transfer), a framework that aligns unimodal large decoder models with domain-specific decision patterns from multimodal biomedical data through preference optimization. While MINT supports different optimization techniques, we primarily implement it with the Odds Ratio Preference Optimization (ORPO) framework as its backbone. This strategy enables the aligned LLMs to perform predictive tasks using text-only or image-only inputs while retaining knowledge learnt from multimodal data. MINT leverages an upstream multimodal machine learning (MML) model trained on high-quality multimodal data to transfer domain-specific insights to downstream text-only or image-only LLMs. We demonstrate its effectiveness through two key applications: (1) Rare genetic disease prediction from texts, where MINT uses a multimodal encoder model, trained on facial photos and clinical notes, to generate a preference dataset for aligning a lightweight Llama 3.2-3B-Instruct. Despite relying on text input only, the MINT-derived model outperforms models trained with SFT, RAG, or DPO, and even outperforms Llama 3.1-405B-Instruct. (2) Tissue type classification using cell nucleus images, where MINT uses a vision-language foundation model as the preference generator, containing knowledge learnt from both text and histopathological images to align downstream image-only models. The resulting MINT-derived model significantly improves the performance of Llama 3.2-Vision-11B-Instruct on tissue type classification. In summary, MINT provides an effective strategy to align unimodal LLMs with high-quality multimodal expertise through preference optimization.
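The ORPO objective named as MINT's backbone can be written out for a single preference pair. The sketch assumes the published ORPO form: a negative log-likelihood term on the preferred response plus lambda times an odds-ratio term, -log sigmoid(log odds(chosen) - log odds(rejected)), with odds(p) = p / (1 - p). Here `p_chosen` and `p_rejected` stand for the model's length-normalized sequence probabilities; the value of `lam` and all names are hypothetical, and how MINT builds the preference pairs from the upstream multimodal model is not shown.

```python
import math


def log_odds(p: float) -> float:
    """log(p / (1 - p)) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(p) - math.log(1.0 - p)


def orpo_loss(p_chosen: float, p_rejected: float, lam: float = 0.1) -> float:
    """ORPO loss for one (chosen, rejected) pair under the assumed form."""
    nll = -math.log(p_chosen)
    log_odds_ratio = log_odds(p_chosen) - log_odds(p_rejected)
    # -log(sigmoid(r)) written as log(1 + exp(-r)).
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll + lam * odds_ratio_term


undecided = orpo_loss(0.5, 0.5, lam=1.0)  # model cannot tell the pair apart
aligned = orpo_loss(0.8, 0.2, lam=1.0)    # model prefers the chosen response
```

The loss falls as the model assigns higher probability to the preferred response and lower odds to the rejected one, which is how preference data generated upstream can shape a text-only or image-only model.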

LGMay 21, 2025
How Transformers Learn In-Context Recall Tasks? Optimality, Training Dynamics and Generalization

Quan Nguyen, Thanh Nguyen-Tang

We study the approximation capabilities, convergence speeds and on-convergence behaviors of transformers trained on in-context recall tasks -- which require recognizing the \emph{positional} association between a pair of tokens from in-context examples. Existing theoretical results only focus on the in-context reasoning behavior of transformers after being trained for \emph{one} gradient descent step. It remains unclear what the on-convergence behavior of transformers trained by gradient descent is and how fast they converge. In addition, the generalization of transformers in one-step in-context reasoning has not been formally investigated. This work addresses these gaps. We first show that a class of transformers with either linear, ReLU or softmax attention is provably Bayes-optimal for an in-context recall task. When trained with gradient descent, we show via a finite-sample analysis that the expected loss converges at a linear rate to the Bayes risk. Moreover, we show that the trained transformers exhibit out-of-distribution (OOD) generalization, i.e., generalizing to samples outside of the population distribution. Our theoretical findings are further supported by extensive empirical validations, showing that \emph{without} proper parameterization, models with larger expressive power surprisingly \emph{fail} to generalize OOD after being trained by gradient descent.

LGMar 2, 2024
Near-optimal Per-Action Regret Bounds for Sleeping Bandits

Quan Nguyen, Nishant A. Mehta

We derive near-optimal per-action regret bounds for sleeping bandits, in which both the sets of available arms and their losses in every round are chosen by an adversary. In a setting with $K$ total arms and at most $A$ available arms in each round over $T$ rounds, the best known upper bound is $O(K\sqrt{TA\ln{K}})$, obtained indirectly via minimizing internal sleeping regrets. Compared to the minimax $Ω(\sqrt{TA})$ lower bound, this upper bound contains an extra multiplicative factor of $K\ln{K}$. We address this gap by directly minimizing the per-action regret using generalized versions of EXP3, EXP3-IX and FTRL with Tsallis entropy, thereby obtaining near-optimal bounds of order $O(\sqrt{TA\ln{K}})$ and $O(\sqrt{T\sqrt{AK}})$. We extend our results to the setting of bandits with advice from sleeping experts, generalizing EXP4 along the way. This leads to new proofs for a number of existing adaptive and tracking regret bounds for standard non-sleeping bandits. Extending our results to the bandit version of experts that report their confidences leads to new bounds for the confidence regret that depend primarily on the sum of experts' confidences. We prove a lower bound, showing that for any minimax-optimal algorithm, there exists an action whose regret is sublinear in $T$ but linear in the number of its active rounds.
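To make the sleeping setting concrete, here is a minimal Python sketch of an EXP3-style learner that only plays arms that are awake in the current round. It is an illustrative assumption, not the generalized EXP3, EXP3-IX, or FTRL algorithms analyzed in the paper: the function names, the renormalization over the available set, and the learning rate `eta` are all hypothetical choices.

```python
import math
import random

def sleeping_exp3_probs(weights, available):
    """Renormalize the weights over the arms that are awake this round."""
    total = sum(weights[a] for a in available)
    return {a: weights[a] / total for a in available}

def sleeping_exp3_update(weights, available, chosen, loss, eta):
    """Exponential-weights update using an importance-weighted loss estimate.

    Only the chosen arm's loss is observed (bandit feedback), so its loss is
    divided by the probability with which it was played.
    """
    probs = sleeping_exp3_probs(weights, available)
    estimate = loss / probs[chosen]
    updated = list(weights)
    updated[chosen] *= math.exp(-eta * estimate)
    return updated

def play_round(weights, available, rng=random):
    """Sample one awake arm according to the renormalized weights."""
    probs = sleeping_exp3_probs(weights, available)
    arms = list(probs)
    return rng.choices(arms, weights=[probs[a] for a in arms], k=1)[0]
```

In a round where only arms 0 and 2 of three equally weighted arms are awake, each is played with probability 0.5, and a unit loss on the chosen arm is scaled to an estimate of 2 before the update.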

LGOct 3, 2025
How to Set $β_1, β_2$ in Adam: An Online Learning Perspective

Quan Nguyen

While Adam is one of the most effective optimizers for training large-scale machine learning models, a theoretical understanding of how to optimally set its momentum factors, $β_1$ and $β_2$, remains largely incomplete. Prior works have shown that Adam can be seen as an instance of Follow-the-Regularized-Leader (FTRL), one of the most important classes of algorithms in online learning. The prior analyses in these works required setting $β_1 = \sqrt{β_2}$, which does not cover the more practical cases with $β_1 \neq \sqrt{β_2}$. We derive novel, more general analyses that hold for both $β_1 \geq \sqrt{β_2}$ and $β_1 \leq \sqrt{β_2}$. In both cases, our results strictly generalize the existing bounds. Furthermore, we show that our bounds are tight in the worst case. We also prove that setting $β_1 = \sqrt{β_2}$ is optimal for an oblivious adversary, but sub-optimal for a non-oblivious adversary.
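For readers who want to see where the two momentum factors enter, the following Python sketch shows the standard textbook Adam update for a single scalar parameter, together with the coupled choice $β_1 = \sqrt{β_2}$ mentioned above. This is an assumption-laden illustration: the paper may analyze a variant without bias correction or the `eps` term, and the helper names `adam_step` and `coupled_beta1` are hypothetical.

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One scalar Adam update with bias correction; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # first-moment (momentum) average
    v = beta2 * v + (1.0 - beta2) * grad * grad   # second-moment average
    m_hat = m / (1.0 - beta1 ** t)                # bias-corrected first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

def coupled_beta1(beta2):
    """The coupled setting beta1 = sqrt(beta2) required by the prior analyses."""
    return math.sqrt(beta2)

# The widely used defaults (0.9, 0.999) fall in the beta1 < sqrt(beta2) regime.
beta1, beta2 = 0.9, 0.999
regime = "beta1 < sqrt(beta2)" if beta1 < coupled_beta1(beta2) else "beta1 >= sqrt(beta2)"
```

With the defaults, $\sqrt{0.999} \approx 0.9995$ exceeds $0.9$, which is one of the uncoupled cases the abstract says earlier analyses did not cover.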

LGSep 12, 2025
Vendi Information Gain for Active Learning and its Application to Ecology

Quan Nguyen, Adji Bousso Dieng

While monitoring biodiversity through camera traps has become an important endeavor for ecological research, identifying species in the captured image data remains a major bottleneck due to limited labeling resources. Active learning -- a machine learning paradigm that selects the most informative data to label and train a predictive model -- offers a promising solution, but typically focuses on uncertainty in the individual predictions without considering uncertainty across the entire dataset. We introduce a new active learning policy, Vendi information gain (VIG), that selects images based on their impact on dataset-wide prediction uncertainty, capturing both informativeness and diversity. We applied VIG to the Snapshot Serengeti dataset and compared it against common active learning methods. VIG needs only 3% of the available data to reach 75% accuracy, a level that baselines require more than 10% of the data to achieve. With 10% of the data, VIG attains 88% predictive accuracy, 12% higher than the best of the baselines. This improvement in performance is consistent across metrics and batch sizes, and we show that VIG also collects more diverse data in the feature space. VIG has broad applicability beyond ecology, and our results highlight its value for biodiversity monitoring in data-limited environments.

CRJun 16, 2025
Theoretically Unmasking Inference Attacks Against LDP-Protected Clients in Federated Vision Models

Quan Nguyen, Minh N. Vu, Truc Nguyen et al.

Federated Learning enables collaborative learning among clients via a coordinating server while avoiding direct data sharing, offering a perceived solution to preserve privacy. However, recent studies on Membership Inference Attacks (MIAs) have challenged this notion, showing high success rates against unprotected training data. While local differential privacy (LDP) is widely regarded as a gold standard for privacy protection in data analysis, most studies on MIAs either neglect LDP or fail to provide theoretical guarantees for attack success rates against LDP-protected data. To address this gap, we derive theoretical lower bounds for the success rates of low-polynomial time MIAs that exploit vulnerabilities in fully connected or self-attention layers. We establish that even when data are protected by LDP, privacy risks persist, depending on the privacy budget. Practical evaluations on federated vision models confirm considerable privacy risks, revealing that the noise required to mitigate these attacks significantly degrades models' utility.

LGMay 25, 2025
Reduce Computational Cost In Deep Reinforcement Learning Via Randomized Policy Learning

Zhuochen Liu, Rahul Jain, Quan Nguyen

Recent advancements in reinforcement learning (RL) have leveraged neural networks to achieve state-of-the-art performance across various control tasks. However, these successes often come at the cost of significant computational resources, as training deep neural networks requires substantial time and data. In this paper, we introduce an actor-critic algorithm that utilizes randomized neural networks to drastically reduce computational costs while maintaining strong performance. Despite its simple architecture, our method effectively solves a range of control problems, including the locomotion control of a highly dynamic 12-motor quadruped robot, and achieves results comparable to leading algorithms such as Proximal Policy Optimization (PPO). Notably, our approach does not outperform other algorithms in terms of sample efficiency but rather in terms of wall-clock training time. That is, although our algorithm requires more timesteps to converge to an optimal policy, the actual time required for training turns out to be lower.

CVApr 12, 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection

Huu-Phong Phan-Nguyen, Anh Dao, Tien-Huy Nguyen et al.

Nowadays, smartphones are ubiquitous, and almost everyone owns one. At the same time, the rapid development of AI has spurred extensive research on applying deep learning techniques to image classification. However, due to the limited resources available on mobile devices, significant challenges remain in balancing accuracy with computational efficiency. In this paper, we propose a novel training framework called Cycle Training, which adopts a three-stage training process that alternates between exploration and stabilization phases to optimize model performance. Additionally, we incorporate Semi-Supervised Domain Adaptation (SSDA) to leverage the power of large models and unlabeled data, thereby effectively expanding the training dataset. Comprehensive experiments on the CamSSD dataset for mobile scene detection demonstrate that our framework not only significantly improves classification accuracy but also ensures real-time inference efficiency. Specifically, our method achieves 94.00% Top-1 accuracy and 99.17% Top-3 accuracy and runs inference in just 1.61 ms on a CPU, demonstrating its suitability for real-world mobile deployment.

LGFeb 12, 2025
Data-dependent Bounds with $T$-Optimal Best-of-Both-Worlds Guarantees in Multi-Armed Bandits using Stability-Penalty Matching

Quan Nguyen, Shinji Ito, Junpei Komiyama et al.

Existing data-dependent and best-of-both-worlds regret bounds for multi-armed bandit problems have limited adaptivity, as they are either data-dependent but not best-of-both-worlds (BOBW), BOBW but not data-dependent, or have a sub-optimal $O(\sqrt{T\ln{T}})$ worst-case guarantee in the adversarial regime. To overcome these limitations, we propose real-time stability-penalty matching (SPM), a new method for obtaining regret bounds that are simultaneously data-dependent, best-of-both-worlds and $T$-optimal for multi-armed bandit problems. In particular, we show that real-time SPM obtains bounds with worst-case guarantees of order $O(\sqrt{T})$ in the adversarial regime and $O(\ln{T})$ in the stochastic regime while simultaneously being adaptive to data-dependent quantities such as sparsity, variations, and small losses. Our results are obtained by extending the SPM technique for tuning the learning rates in the follow-the-regularized-leader (FTRL) framework, which further indicates that the combination of SPM and FTRL is a promising approach for proving new adaptive bounds in online learning problems.

LGMay 1, 2023
Cross-Institutional Transfer Learning for Educational Models: Implications for Model Performance, Fairness, and Equity

Josh Gardner, Renzhe Yu, Quan Nguyen et al.

Modern machine learning increasingly supports paradigms that are multi-institutional (using data from multiple institutions during training) or cross-institutional (using models from multiple institutions for inference), but the empirical effects of these paradigms are not well understood. This study investigates cross-institutional learning via an empirical case study in higher education. We propose a framework and metrics for assessing the utility and fairness of student dropout prediction models that are transferred across institutions. We examine the feasibility of cross-institutional transfer under real-world data- and model-sharing constraints, quantifying model biases for intersectional student identities, characterizing potential disparate impact due to these biases, and investigating the impact of various cross-institutional ensembling approaches on fairness and overall model performance. We perform this analysis on data representing over 200,000 enrolled students annually from four universities without sharing training data between institutions. We find that a simple zero-shot cross-institutional transfer procedure can achieve similar performance to locally-trained models for all institutions in our study, without sacrificing model fairness. We also find that stacked ensembling provides no additional benefits to overall performance or fairness compared to either a local model or the zero-shot transfer procedure we tested. We find no evidence of a fairness-accuracy tradeoff across dozens of models and transfer schemes evaluated. Our auditing procedure also highlights the importance of intersectional fairness analysis, revealing performance disparities at the intersection of sensitive identity groups that are concealed under one-dimensional analysis.

LGFeb 8, 2022
Nonmyopic Multiclass Active Search with Diminishing Returns for Diverse Discovery

Quan Nguyen, Roman Garnett

Active search is a setting in adaptive experimental design where we aim to uncover members of rare, valuable class(es) subject to a budget constraint. An important consideration in this problem is diversity among the discovered targets -- in many applications, diverse discoveries offer more insight and may be preferable in downstream tasks. However, most existing active search policies either assume that all targets belong to a common positive class or encourage diversity via simple heuristics. We present a novel formulation of active search with multiple target classes, characterized by a utility function chosen from a flexible family whose members encourage diversity via a diminishing returns mechanism. We then study this problem under the Bayesian lens and prove a hardness result for approximating the optimal policy for arbitrary positive, increasing, and concave utility functions. Finally, we design an efficient, nonmyopic approximation to the optimal policy for this class of utilities and demonstrate its superior empirical performance in a variety of settings, including drug discovery.

ROOct 13, 2021
Contact-timing and Trajectory Optimization for 3D Jumping on Quadruped Robots

Chuong Nguyen, Quan Nguyen

Performing highly agile acrobatic motions with a long flight phase requires perfect timing, high accuracy, and coordination of the full-body motion. To address these challenges, we present a novel contact-timing and trajectory optimization framework for legged robots performing aggressive 3D jumping. In our method, we first use an effective optimization framework based on simplified rigid body dynamics to solve for contact timings and a reference trajectory of the robot body. The solution of this module is then used to formulate a full-body trajectory optimization based on the full nonlinear dynamics of the robot. This combination allows us to effectively optimize for contact timings while ensuring that the jumping trajectory can be effectively realized on the robot hardware. We first validate the efficiency of the proposed framework on the A1 robot model for various 3D jumping tasks such as double backflips from a height of 2 m. Experimental validation was then successfully conducted for various aggressive 3D jumping motions such as diagonal jumps, barrel roll, and double barrel roll from boxes of heights 0.4 m and 0.9 m, respectively.

ROSep 21, 2021
Balancing Control and Pose Optimization for Wheel-legged Robots Navigating High Obstacles

Junheng Li, Junchao Ma, Quan Nguyen

In this paper, we propose a novel approach to controlling wheel-legged quadrupedal robots using pose optimization and force control via quadratic programming (QP). Our method allows the robot to leverage whole-body motion and wheel actuation to roll over high obstacles while maintaining wheel traction and balancing the robot body. In detail, we first present a linear rigid body dynamics model with wheels that can be used for real-time balancing control of wheel-legged robots. We then introduce an effective pose optimization method for wheel-legged robot locomotion over steep ramp and stair terrains. The pose optimization solves for optimal poses to enhance stability and enforce collision-free constraints for the rolling motion over stair terrain. Experimental validation on the real robot demonstrated the capability of rolling up onto a 0.36 m obstacle. The robot can also successfully roll up and down multiple stairs without lifting its legs or colliding with the terrain.

LGJun 11, 2021
Nonmyopic Multifidelity Active Search

Quan Nguyen, Arghavan Modiri, Roman Garnett

Active search is a learning paradigm where we seek to identify as many members of a rare, valuable class as possible given a labeling budget. Previous work on active search has assumed access to a faithful (and expensive) oracle reporting experimental results. However, some settings offer access to cheaper surrogates such as computational simulation that may aid in the search. We propose a model of multifidelity active search, as well as a novel, computationally efficient policy for this setting that is motivated by state-of-the-art classical policies. Our policy is nonmyopic and budget aware, allowing for a dynamic tradeoff between exploration and exploitation. We evaluate the performance of our solution on real-world datasets and demonstrate significantly better performance than natural benchmarks.

ROMar 31, 2021
Force-and-moment-based Model Predictive Control for Achieving Highly Dynamic Locomotion on Bipedal Robots

Junheng Li, Quan Nguyen

In this paper, we propose a novel framework for force-and-moment-based Model Predictive Control (MPC) for dynamic legged robots. Specifically, we present a formulation of MPC designed for 10 degree-of-freedom (DoF) bipedal robots using simplified rigid body dynamics with input forces and moments. This MPC controller calculates the optimal inputs applied to the robot, including 3-D forces and 2-D moments at each foot. These desired inputs are then generated by mapping the forces and moments to motor torques of the 5 actuators on each leg. We evaluate our proposed control design in physical simulation of a 10-DoF bipedal robot. The robot can achieve fast walking speeds of up to 1.6 m/s on rough terrain, with accurate velocity tracking. Our proposed approach can achieve a wide range of dynamic motions including walking, hopping, and running using the same control framework and the same set of control parameters.

ROMar 11, 2021
Robust High-speed Running for Quadruped Robots via Deep Reinforcement Learning

Guillaume Bellegarda, Yiyu Chen, Zhuochen Liu et al.

Deep reinforcement learning has emerged as a popular and powerful way to develop locomotion controllers for quadruped robots. Common approaches have largely focused on learning actions directly in joint space, or learning to modify and offset foot positions produced by trajectory generators. Both approaches typically require careful reward shaping and training for millions of time steps, and trajectory generators additionally introduce human bias into the resulting control policies. In this paper, we present a learning framework that leads to the natural emergence of fast and robust bounding policies for quadruped robots. The agent both selects and controls actions directly in task space to track desired velocity commands subject to environmental noise including model uncertainty and rough terrain. We observe that this framework improves sample efficiency, necessitates little reward shaping, leads to the emergence of natural gaits such as galloping and bounding, and eases the sim-to-real transfer at running speeds. Policies can be learned in only a few million time steps, even for challenging tasks of running over rough terrain with loads of over 100% of the nominal quadruped mass. Training occurs in PyBullet, and we perform a sim-to-sim transfer to Gazebo and sim-to-real transfer to the Unitree A1 hardware. For sim-to-sim, our results show the quadruped is able to run at over 4 m/s without a load, and 3.5 m/s with a 10 kg load, which is over 83% of the nominal quadruped mass. For sim-to-real, the Unitree A1 is able to bound at 2 m/s with a 5 kg load, representing 42% of the nominal quadruped mass.

RONov 14, 2020
Locomotion and Control of a Friction-Driven Tripedal Robot

Mark Hermes, Taylor McLaughlin, Mitul Luhar et al.

This letter considers control of a radially symmetric tripedal friction-driven robot. The robot features 3 servo motors mounted on a 3-D printed chassis 7 cm from the center of mass and separated by 120 degrees. These motors drive limbs, which impart frictional reactive forces on the body. Experimental observations performed on a uniform friction surface validated a mathematical model for robot motion. This model was used to create a gait map, which features instantaneous omni-directional control. We demonstrated line following using live feedback from an overhead tracking camera. Proportional-Integral error compensation performance was compared to a basic position update procedure on a rectangular course. The controller reduced path error by approximately $46\%$. The error compensator is also able to correct for aerodynamic disturbances generated by a high-volume industrial fan with a mean flow speed of $5.5\,\mathrm{m\,s^{-1}}$, reducing path error by $65\%$ relative to the basic position update procedure.
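As a rough illustration of the Proportional-Integral error compensation mentioned above, here is a generic discrete-time PI controller in Python. It is a sketch under assumptions: the class name, the gains `kp` and `ki`, and the use of a scalar path error are hypothetical and are not taken from the letter.

```python
class PIController:
    """Discrete-time PI compensator acting on a scalar tracking error."""

    def __init__(self, kp, ki):
        self.kp = kp            # proportional gain (hypothetical value chosen by the user)
        self.ki = ki            # integral gain (hypothetical value chosen by the user)
        self.integral = 0.0     # running integral of the error

    def update(self, error, dt):
        """Accumulate the error over one time step and return the correction."""
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


# Example: a constant path error of 1.0 sampled every 0.5 s.
controller = PIController(kp=2.0, ki=1.0)
first = controller.update(1.0, 0.5)
second = controller.update(1.0, 0.5)
```

The integral term keeps growing while the error persists, which is what lets a PI loop cancel a steady disturbance such as a constant airflow that a purely proportional correction would leave as a residual offset.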

RONov 13, 2020
Robust Quadruped Jumping via Deep Reinforcement Learning

Guillaume Bellegarda, Chuong Nguyen, Quan Nguyen

In this paper, we consider a general task of jumping varying distances and heights for a quadrupedal robot in noisy environments, such as off of uneven terrain and with variable robot dynamics parameters. To accurately jump in such conditions, we propose a framework using deep reinforcement learning that leverages and augments the complex solution of nonlinear trajectory optimization for quadrupedal jumping. While the standalone optimization limits jumping to take-off from flat ground and requires accurate assumptions of robot dynamics, our proposed approach improves the robustness to allow jumping off of significantly uneven terrain with variable robot dynamical parameters and environmental conditions. Compared with walking and running, the realization of aggressive jumping on hardware necessitates accounting for the motors' torque-speed relationship as well as the robot's total power limits. By incorporating these constraints into our learning framework, we successfully deploy our policy sim-to-real without further tuning, fully exploiting the available onboard power supply and motors. We demonstrate robustness to environment noise of foot disturbances of up to 6 cm in height, or 33% of the robot's nominal standing height, while jumping 2x the body length in distance.

RONov 13, 2020
Safe and Robust Motion Planning for Dynamic Robotics via Control Barrier Functions

Aniketh Manjunath, Quan Nguyen

Control Barrier Functions (CBF) are widely used to enforce safety-critical constraints on nonlinear systems. Recently, these functions have been incorporated into path planning frameworks to design safety-critical path planners. However, these methods fall short of providing a realistic path considering both the algorithm's run-time complexity and enforcement of the safety-critical constraints. This paper proposes a novel motion planning approach using the well-known Rapidly Exploring Random Trees (RRT) algorithm that enforces both CBF and the robot's kinodynamic constraints to generate a safety-critical path. The proposed algorithm also outputs the corresponding control signals that result in the obstacle-free path. The approach also allows considering model uncertainties by incorporating robust CBF constraints into the proposed framework. Thus, the resulting path is free of any obstacles and accounts for the model uncertainty from robot dynamics and perception. Result analysis indicates that the proposed method outperforms various conventional RRT-based path planners, guaranteeing a safety-critical path with minimal computational overhead. We present numerical validation of the algorithm on the Hamster V7 robot car, a micro autonomous Unmanned Ground Vehicle that performs dynamic navigation on an obstacle-ridden path with various uncertainties in perception noise and robot dynamics.
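The idea of a CBF constraint can be shown on a toy problem. The Python sketch below assumes a one-dimensional single integrator $\dot{x} = u$ with a wall at `x_max` and barrier $h(x) = x_{\max} - x$; the condition $\dot{h} + \alpha h \geq 0$ then reduces to $u \leq \alpha (x_{\max} - x)$, and the usual minimally invasive QP has a closed-form solution. The system, the names, and the gain `alpha` are illustrative assumptions and are unrelated to the RRT planner or the robot model in the paper.

```python
def cbf_filter(x, u_nom, x_max, alpha):
    """Closest input to u_nom that satisfies u <= alpha * (x_max - x)."""
    return min(u_nom, alpha * (x_max - x))

def simulate(x0, u_nom, x_max, alpha, dt, steps):
    """Forward-Euler rollout of x' = u with the CBF filter applied at every step."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nom, x_max, alpha)
        trajectory.append(x)
    return trajectory

# A nominal command that would drive straight through the wall at x_max = 1.0.
trajectory = simulate(x0=0.0, u_nom=1.0, x_max=1.0, alpha=2.0, dt=0.01, steps=1000)
```

Far from the wall the filter leaves the nominal command untouched; near the wall it scales the command down so the state approaches `x_max` without crossing it, provided `alpha * dt` is below 1 in this discretization.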

RONov 12, 2020
Adaptive Force-based Control for Legged Robots

Mohsen Sombolestan, Yiyu Chen, Quan Nguyen

Adaptive control can address model uncertainty in control systems. However, it is primarily designed for tracking control. Recent advancements in the control of quadruped robots show that force control can effectively realize agile and robust locomotion. In this paper, we present a novel adaptive force-based control framework for legged robots. We introduce a new architecture in our proposed approach to incorporate adaptive control into quadratic programming (QP) force control. Since our approach is based on force control, it also retains the advantages of the baseline framework, such as robustness to uneven terrain, controllable friction constraints, or soft impacts. Our method is successfully validated in both simulation and hardware experiments. While the baseline QP control has shown a significant degradation in the body tracking error with a small load, our proposed adaptive force-based control can enable the 12-kg Unitree A1 robot to walk on rough terrains while carrying a heavy load of up to 6 kg (50% of the robot weight). When standing with four legs, our proposed adaptive control can even allow the robot to carry up to 11 kg of load (92% of the robot weight) with less than 5-cm tracking error in the robot height.

HCOct 16, 2020
Guided Data Discovery in Interactive Visualizations via Active Search

Shayan Monadjemi, Sunwoo Ha, Quan Nguyen et al.

Recent advances in visual analytics have enabled us to learn from user interactions and uncover analytic goals. These innovations set the foundation for actively guiding users during data exploration. Providing such guidance will become more critical as datasets grow in size and complexity, precluding exhaustive investigation. Meanwhile, the machine learning community also struggles with datasets growing in size and complexity, precluding exhaustive labeling. Active learning is a broad family of algorithms developed for actively guiding models during training. We will consider the intersection of these analogous research thrusts. First, we discuss the nuances of matching the choice of an active learning algorithm to the task at hand. This is critical for performance, a fact we demonstrate in a simulation study. We then present results of a user study for the particular task of data discovery guided by an active learning algorithm specifically designed for this task.

SYMay 14, 2020
Robust Safety-Critical Control for Dynamic Robotics

Quan Nguyen, Koushil Sreenath

We present a novel method of optimal robust control through quadratic programs that offers tracking stability while subject to input and state-based constraints as well as safety-critical constraints for nonlinear dynamical robotic systems in the presence of model uncertainty. The proposed method formulates robust control Lyapunov and barrier functions to provide guarantees of stability and safety in the presence of model uncertainty. We evaluate our proposed control design on dynamic walking of a five-link planar bipedal robot subject to contact force constraints as well as safety-critical precise foot placements on stepping stones, all while subject to model uncertainty. We conduct preliminary experimental validation of the proposed controller on a rectilinear spring-cart system under different types of model uncertainty and perturbations.

CRAug 29, 2019
StairDag: Cross-DAG Validation For Scalable BFT Consensus

Quan Nguyen, Andre Cronje, Michael Kong et al.

This paper introduces a new consensus protocol, called \emph{StairDag}, for fast consensus in DAG-based trustless systems. In StairDag, we propose a new approach to creating a local block DAG, namely \emph{x-DAG} (cross-DAG), on each node. The StairDag protocol is based on our Proof-of-Stake StakeDag framework \cite{stakedag} that distinguishes participants into users and validators by their stake. Both users and validators can create and validate event blocks. Unlike StakeDag's DAG, x-DAG ensures that each new block has to have parent blocks from both users and validators to achieve more safety and liveness. Our protocol leverages a pool of validators to expose more validating power to new blocks for faster consensus in a leaderless asynchronous system. Further, our framework allows participants to join as observers / monitors, who can retrieve the DAG for post-validation, but do not participate in on-chain validation.

DCJul 5, 2019
StakeDag: Stake-based Consensus For Scalable Trustless Systems

Quan Nguyen, Andre Cronje, Michael Kong et al.

Trustless systems, such as those powered by blockchains, provide trust in the system regardless of the trust of its participants, who may be honest or malicious. Proof-of-stake (PoS) protocols and DAG-based approaches have emerged as better alternatives to proof of work (PoW) for consensus. This paper introduces a new model, called \emph{StakeDag}, which aims for PoS consensus in a DAG-based trustless system. We address a general model of trustless systems in which participants are distinguished by their stake or trust: users and validators. Users are normal participants with no assumed trust, and validators are high-profile participants with an established trust. We then propose a new family of stake-based consensus protocols $\mathfrak{S}$, operating on the DAG as in the Lachesis protocol~\cite{lachesis01}. Specifically, we propose a stake-based protocol $S_φ$ that leverages participants' stake as validating weights to achieve more secure distributed systems with practical Byzantine fault tolerance (pBFT) in a leaderless asynchronous Directed Acyclic Graph (DAG). We then present a general model of staking for asynchronous DAG-based distributed systems.

RODec 4, 2017
Deep Visual Perception for Dynamic Walking on Discrete Terrain

Avinash Siravuru, Allan Wang, Quan Nguyen et al.

Dynamic bipedal walking on discrete terrain, like stepping stones, is a challenging problem requiring feedback controllers to enforce safety-critical constraints. To enforce such constraints in real-world experiments, fast and accurate perception for foothold detection and estimation is needed. In this work, a deep visual perception model is designed to accurately estimate step length of the next step, which serves as input to the feedback controller to enable vision-in-the-loop dynamic walking on discrete terrain. In particular, a custom convolutional neural network architecture is designed and trained to predict step length to the next foothold using a sampled image preview of the upcoming terrain at foot impact. The visual input is offered only at the beginning of each step and is shown to be sufficient for the job of dynamically stepping onto discrete footholds. Through extensive numerical studies, we show that the robot is able to successfully autonomously walk for over 100 steps without failure on a discrete terrain with footholds randomly positioned within a step length range of 45-85 centimeters.

AIFeb 20, 2017
The Dialog State Tracking Challenge with Bayesian Approach

Quan Nguyen

Generative models have been one of the most common approaches for solving the Dialog State Tracking Problem, with the capability to model the dialog hypotheses in an explicit manner. The most important task in such Bayesian network models is constructing the most reliable user models by learning and reflecting the training data into the probability distribution of user actions conditional on network states. This paper provides an overall picture of the learning process in a Bayesian framework with an emphasis on the state-of-the-art theoretical analyses of the Expectation Maximization learning algorithm.