Huy Nguyen

h-index: 4

Papers: 4

Citations: 40

Novelty: 57%

AI Score: 34

Ranked #110,563 of 194,257 authors (top 57%); #3,302 in RO (top 49%)

4 Papers

9.6 | AI | May 19, 2025 | Code

CompeteSMoE -- Statistically Guaranteed Mixture of Experts Training via Competition

Nam V. Nguyen, Huy Nguyen, Quang Pham et al.

Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width. However, we argue that effective SMoE training remains challenging because of the suboptimal routing process, in which the experts that perform the computation do not directly contribute to routing. In this work, we propose competition, a novel mechanism that routes tokens to the experts with the highest neural response. Theoretically, we show that the competition mechanism enjoys better sample efficiency than traditional softmax routing. Furthermore, we develop CompeteSMoE, a simple yet effective algorithm to train large language models by deploying a router to learn the competition policy, thus enjoying strong performance at a low training overhead. Our extensive empirical evaluations on both visual instruction tuning and language pre-training tasks demonstrate the efficacy, robustness, and scalability of CompeteSMoE compared to state-of-the-art SMoE strategies. We have made the implementation available at: https://github.com/Fsoft-AIC/CompeteSMoE. This work is an improved version of the previous study at arXiv:2402.02526.
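The competition idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the names `compete_route` and `make_scaling_expert` are hypothetical, the "neural response" is assumed here to be the L2 norm of each expert's output, and the softmax mixing of the winners is likewise an assumption. The learned router that predicts competition outcomes is not modeled.

```python
import math

def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))

def softmax(scores):
    # Subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def compete_route(token, experts, k):
    """Run every expert, keep the k with the largest response, mix their outputs."""
    outputs = [expert(token) for expert in experts]
    # Assumption: response = L2 norm of the expert output.
    responses = [l2_norm(out) for out in outputs]
    winners = sorted(range(len(experts)),
                     key=lambda i: responses[i], reverse=True)[:k]
    # Assumption: winners are combined with softmax weights over their responses.
    weights = softmax([responses[i] for i in winners])
    dim = len(outputs[0])
    mixed = [sum(w * outputs[i][d] for w, i in zip(weights, winners))
             for d in range(dim)]
    return winners, mixed

def make_scaling_expert(scale):
    # Toy expert: scales the token; real experts would be feed-forward networks.
    return lambda token: [scale * t for t in token]

experts = [make_scaling_expert(s) for s in (0.5, 2.0, 1.0)]
winners, mixed = compete_route([1.0, 0.0], experts, k=2)
```

Because every expert must run before the winners are known, this plain form costs as much as a dense model, which is presumably why the abstract pairs competition with a router that learns the competition policy.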

23.1 | LG | Feb 4, 2024

CompeteSMoE -- Effective Training of Sparse Mixture of Experts via Competition

Quang Pham, Giang Do, Huy Nguyen et al.

Sparse mixture of experts (SMoE) offers an appealing solution to scale up the model complexity beyond the means of increasing the network's depth or width. However, effective training of SMoE has proven to be challenging due to the representation collapse issue, which causes parameter redundancy and limited representation potential. In this work, we propose a competition mechanism to address this fundamental challenge of representation collapse. By routing inputs only to experts with the highest neural response, we show that, under mild assumptions, competition enjoys the same convergence rate as the optimal estimator. We further propose CompeteSMoE, an effective and efficient algorithm to train large language models by deploying a simple router that predicts the competition outcomes. Consequently, CompeteSMoE enjoys strong performance gains from the competition routing policy while having low computational overhead. Our extensive empirical evaluations on two transformer architectures and a wide range of tasks demonstrate the efficacy, robustness, and scalability of CompeteSMoE compared to state-of-the-art SMoE strategies.

3.2 | RO | Sep 27, 2017

Touch-based object localization in cluttered environments

Huy Nguyen, Quang-Cuong Pham

Touch-based object localization is an important component of autonomous robotic systems that are to perform dexterous tasks in real-world environments. When the objects to locate are placed in clutter, this touch-based procedure tends to generate outlier measurements which, in turn, can lead to a significant loss in localization precision. Our first contribution is to address this problem by applying the RANdom SAmple Consensus (RANSAC) method to a Bayesian estimation framework. As RANSAC requires repeatedly applying the (computationally intensive) Bayesian updating step, it is crucial to improve that step in order to achieve practical running times. Our second contribution is therefore a fast method to find the most probable object face that corresponds to a given touch measurement, which yields a significant acceleration of the Bayesian updating step. Experiments show that our overall algorithm provides accurate localization in practical times, even when the measurements are corrupted by outliers.
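To show why RANSAC helps with outlier touches, here is a generic sketch on a toy one-dimensional problem. It is not the paper's method: the Bayesian updating step is replaced by a plain mean over the consensus set, and the function name `ransac_scalar`, the tolerance, and the measurement values are all made up for illustration.

```python
import random

def ransac_scalar(measurements, tol, iterations, rng):
    """Estimate a scalar position from measurements that may contain outliers."""
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: a single measurement defines a hypothesis.
        hypothesis = rng.choice(measurements)
        # Consensus set: measurements that agree with the hypothesis.
        inliers = [m for m in measurements if abs(m - hypothesis) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the largest consensus set (a plain mean here, standing in
    # for the Bayesian update described in the abstract).
    estimate = sum(best_inliers) / len(best_inliers)
    return estimate, best_inliers

# Hypothetical touch readings: true position near 5.0, two outliers from clutter.
touches = [5.01, 4.98, 9.00, 5.02, 5.00, 4.99, 1.20, 5.03, 4.97, 5.00]
estimate, inliers = ransac_scalar(touches, tol=0.1, iterations=50,
                                  rng=random.Random(0))
```

Each iteration re-evaluates the whole measurement set against a hypothesis, which mirrors the abstract's point that the cost of the repeated update step dominates the running time.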

5.6 | RO | Jun 12, 2017

On the covariance of X in AX = XB

Huy Nguyen, Quang-Cuong Pham

Hand-eye calibration, which consists in identifying the rigid-body transformation between a camera mounted on the robot end-effector and the end-effector itself, is a fundamental problem in robot vision. Mathematically, this problem can be formulated as: solve for X in AX = XB. In this paper, we provide a rigorous derivation of the covariance of the solution X, when A and B are randomly perturbed matrices. This fine-grained information is critical for applications that require a high degree of perception precision. Our approach consists in applying covariance propagation methods in SE(3). Experiments involving synthetic and real calibration data confirm that our approach can predict the covariance of the hand-eye transformation with excellent precision.
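For readers unfamiliar with the equation, AX = XB can be expanded into its rotation and translation parts. The block notation below (R for the rotation, t for the translation of each transform) is an assumption introduced here and does not appear in the abstract; it follows from writing each element of SE(3) as a 4x4 homogeneous matrix.

```latex
A = \begin{bmatrix} R_A & t_A \\ 0 & 1 \end{bmatrix}, \quad
X = \begin{bmatrix} R_X & t_X \\ 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} R_B & t_B \\ 0 & 1 \end{bmatrix}

AX = XB
\;\Longleftrightarrow\;
\begin{cases}
R_A R_X = R_X R_B, \\
R_A t_X + t_A = R_X t_B + t_X.
\end{cases}
```

The first equation involves only rotations, while the second ties the translation of X to its rotation, so perturbations in A and B affect both parts of the solution.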