Thieu N. Vo

h-index: 24

6 papers

23 citations

Novelty: 53%

AI Score: 49

Ranked #48,838 of 201,326 authors (top 24%); #11,125 in LG (top 26%)

6 Papers

LG · Sep 18, 2024

Monomial Matrix Group Equivariant Neural Functional Networks

Viet-Hoang Tran, Thieu N. Vo, Tho H. Tran et al.

Neural functional networks (NFNs) have recently gained significant attention due to their diverse applications, ranging from predicting network generalization and network editing to classifying implicit neural representations. Previous NFN designs often depend on permutation symmetries in neural networks' weights, which traditionally arise from the unordered arrangement of neurons in hidden layers. However, these designs do not take into account the weight scaling symmetries of $\operatorname{ReLU}$ networks, or the weight sign-flipping symmetries of $\sin$ or $\tanh$ networks. In this paper, we extend the study of the group action on the network weights from the group of permutation matrices to the group of monomial matrices by incorporating scaling/sign-flipping symmetries. In particular, we encode these scaling/sign-flipping symmetries by designing corresponding equivariant and invariant layers. We name our new family of NFNs the Monomial Matrix Group Equivariant Neural Functional Networks (Monomial-NFN). Because of the expansion of the symmetries, Monomial-NFN has far fewer independent trainable parameters than the baseline NFNs in the literature, thus enhancing the model's efficiency. Moreover, for fully connected and convolutional neural networks, we theoretically prove that all groups that leave these networks invariant while acting on their weight spaces are subgroups of the monomial matrix group. We provide empirical evidence to demonstrate the advantages of our model over existing baselines, achieving competitive performance and efficiency.
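The weight symmetry this abstract refers to can be checked numerically. The sketch below is an illustration only, not the paper's code: it applies a monomial matrix (a permutation combined with positive per-neuron scaling) to the hidden layer of a small ReLU MLP and confirms the network output is unchanged. The helper names (`mlp`, `monomial_action`) and all sizes are hypothetical choices made for this example.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mlp(W1, b1, W2, b2, x):
    """Two-layer ReLU MLP: W2 @ relu(W1 @ x + b1) + b2."""
    hidden = relu([a + b for a, b in zip(matvec(W1, x), b1)])
    return [a + b for a, b in zip(matvec(W2, hidden), b2)]

def monomial_action(W1, b1, W2, perm, scale):
    """Hypothetical helper: new hidden neuron j is old neuron perm[j],
    rescaled by scale[j] > 0. Outgoing weights are permuted the same way
    and divided by the same scale, which undoes the change after ReLU."""
    n = len(perm)
    W1n = [[scale[j] * w for w in W1[perm[j]]] for j in range(n)]
    b1n = [scale[j] * b1[perm[j]] for j in range(n)]
    W2n = [[row[perm[j]] / scale[j] for j in range(n)] for row in W2]
    return W1n, b1n, W2n

random.seed(0)
n_in, n_hidden, n_out = 3, 4, 2
W1 = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [random.gauss(0, 1) for _ in range(n_hidden)]
W2 = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
b2 = [random.gauss(0, 1) for _ in range(n_out)]

perm = [2, 0, 3, 1]             # reorder the hidden neurons
scale = [0.5, 2.0, 3.0, 0.25]   # positive scales, one per hidden neuron
W1n, b1n, W2n = monomial_action(W1, b1, W2, perm, scale)

x = [0.3, -1.2, 0.8]
y_old = mlp(W1, b1, W2, b2, x)
y_new = mlp(W1n, b1n, W2n, b2, x)
max_gap = max(abs(a - b) for a, b in zip(y_old, y_new))
print(max_gap)  # difference between the two outputs, up to rounding
```

The scales must be positive because ReLU satisfies relu(c * z) = c * relu(z) only for c > 0; for odd activations such as tanh the analogous symmetry is a sign flip.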

HC · Nov 26, 2025

MMA: A Momentum Mamba Architecture for Human Activity Recognition with Inertial Sensors

Thai-Khanh Nguyen, Uyen Vo, Tan M. Nguyen et al.

Human activity recognition (HAR) from inertial sensors is essential for ubiquitous computing, mobile health, and ambient intelligence. Conventional deep models such as Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and Transformers have advanced HAR but remain limited by vanishing or exploding gradients, high computational cost, and difficulty in capturing long-range dependencies. Structured state-space models (SSMs) like Mamba address these challenges with linear complexity and effective temporal modeling, yet they are restricted to first-order dynamics without stable long-term memory mechanisms. We introduce Momentum Mamba, a momentum-augmented SSM that incorporates second-order dynamics to improve stability of information flow across time steps, robustness, and long-sequence modeling. Two extensions further expand its capacity: Complex Momentum Mamba for frequency-selective memory scaling. Experiments on multiple HAR benchmarks demonstrate consistent gains over vanilla Mamba and Transformer baselines in accuracy, robustness, and convergence speed. With only moderate increases in training cost, momentum-augmented SSMs offer a favorable accuracy-efficiency balance, establishing them as a scalable paradigm for HAR and a promising principal framework for broader sequence modeling applications.
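To make the contrast between first-order and second-order dynamics concrete, here is a minimal scalar sketch. It uses a heavy-ball style momentum recurrence, which is an assumption made for illustration and is not necessarily the formulation used in Momentum Mamba; the names `first_order_scan`, `momentum_scan`, `alpha`, and `beta` are hypothetical.

```python
def first_order_scan(a, b, xs):
    """First-order linear recurrence: h_t = a * h_{t-1} + b * x_t."""
    h, out = 0.0, []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out

def momentum_scan(a, b, xs, alpha=1.0, beta=0.0):
    """Assumed heavy-ball variant (illustrative only):
        v_t = beta * v_{t-1} + alpha * b * x_t
        h_t = a * h_{t-1} + v_t
    Eliminating v gives a second-order recurrence in h:
        h_t = (a + beta) * h_{t-1} - a * beta * h_{t-2} + alpha * b * x_t
    """
    h, v, out = 0.0, 0.0, []
    for x in xs:
        v = beta * v + alpha * b * x
        h = a * h + v
        out.append(h)
    return out

xs = [1.0, 0.0, -0.5, 2.0, 0.25, 0.0, 0.0, 1.5]
plain = first_order_scan(0.9, 0.5, xs)
no_momentum = momentum_scan(0.9, 0.5, xs, alpha=1.0, beta=0.0)
with_momentum = momentum_scan(0.9, 0.5, xs, alpha=1.0, beta=0.6)
```

With `beta=0` and `alpha=1` the momentum variant reduces to the first-order recurrence, so under this assumed form the extra state `v` is the only thing that introduces dependence on `h` two steps back.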

CV · May 2, 2022

Design equivariant neural networks for 3D point cloud

Thuan N. A. Trang, Thieu N. Vo, Khuong D. Nguyen

This work seeks to improve the generalization and robustness of existing neural networks for 3D point clouds by inducing group equivariance under general group transformations. The main challenge when designing equivariant models for point clouds is how to trade off model performance against complexity. Existing equivariant models are either too complicated to implement or have very high complexity. The main aim of this study is to build a general procedure for introducing the group equivariance property into SOTA models for 3D point clouds. The group equivariant models built from our procedure are simple to implement, less complex than existing ones, and preserve the strengths of the original SOTA backbone. Experiments on object classification show that our methods are superior to other group equivariant models in performance and complexity. Moreover, our method also helps to improve the mIoU of semantic segmentation models. Overall, by combining equivariance under only finitely many rotations with augmentation, our models can outperform existing full $SO(3)$-equivariant models with much lower complexity and GPU memory usage. The proposed procedure is general and forms a fundamental approach to group equivariant neural networks. We believe that it can be easily adapted to other SOTA models in the future.

LG · Feb 7, 2024

E(3)-Equivariant Mesh Neural Networks

Thuan Trang, Nhat Khang Ngo, Daniel Levy et al.

Triangular meshes are widely used to represent three-dimensional objects. As a result, many recent works have addressed the need for geometric deep learning on 3D meshes. However, we observe that the complexity of many of these architectures does not translate to practical performance, and simple deep models for geometric graphs are competitive in practice. Motivated by this observation, we minimally extend the update equations of E(n)-Equivariant Graph Neural Networks (EGNNs) (Satorras et al., 2021) to incorporate mesh face information, and further improve them to account for long-range interactions through hierarchy. The resulting architecture, Equivariant Mesh Neural Network (EMNN), outperforms other, more complicated equivariant methods on mesh tasks, with a fast run-time and no expensive pre-processing. Our implementation is available at https://github.com/HySonLab/EquiMesh
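As background for the EGNN update equations mentioned above, the following sketch checks equivariance of an EGNN-style coordinate update under a rigid motion. It is a simplified illustration, not the EMNN implementation: node features and the mesh face terms are omitted, and the scalar weight function `phi` is a fixed hypothetical stand-in for a learned network.

```python
import math

def egnn_coord_update(xs, phi=lambda d2: 1.0 / (1.0 + d2)):
    """Simplified EGNN-style coordinate update:
        x_i' = x_i + (1 / (N - 1)) * sum_{j != i} (x_i - x_j) * phi(|x_i - x_j|^2)
    phi depends only on squared distances, which rigid motions preserve."""
    n = len(xs)
    updated = []
    for i, xi in enumerate(xs):
        agg = [0.0, 0.0, 0.0]
        for j, xj in enumerate(xs):
            if i == j:
                continue
            diff = [p - q for p, q in zip(xi, xj)]
            weight = phi(sum(c * c for c in diff))
            agg = [s + weight * c for s, c in zip(agg, diff)]
        updated.append([p + s / (n - 1) for p, s in zip(xi, agg)])
    return updated

def rigid_motion(xs, theta, t):
    """Rotate every point by angle theta about the z-axis, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * x - s * y + t[0], s * x + c * y + t[1], z + t[2]]
            for x, y, z in xs]

points = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 3.0]]
theta, shift = 0.7, [1.0, -2.0, 0.5]

# Equivariance: updating moved points equals moving the updated points.
move_then_update = egnn_coord_update(rigid_motion(points, theta, shift))
update_then_move = rigid_motion(egnn_coord_update(points), theta, shift)
max_gap = max(abs(p - q)
              for a, b in zip(move_then_update, update_then_move)
              for p, q in zip(a, b))
print(max_gap)  # gap between the two orders of operations, up to rounding
```

The check works because the update is built only from relative position vectors weighted by functions of distances; any extension that adds terms of the same kind keeps this property.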

49.4 · LG · Apr 26

Quasi-Equivariant Metanetworks

Viet-Hoang Tran, An Nguyen, Benoît Guérand et al.

Metanetworks are neural architectures designed to operate directly on pretrained weights to perform downstream tasks. However, the parameter space serves only as a proxy for the underlying function class, and the parameter-function mapping is inherently non-injective: distinct parameter configurations may yield identical input-output behaviors. As a result, metanetworks that rely solely on raw parameters risk overlooking the intrinsic symmetries of the architecture. Reasoning about functional identity is therefore essential for effective metanetwork design, motivating the development of equivariant metanetworks, which incorporate equivariance principles to respect architectural symmetries. Existing approaches, however, typically enforce strict equivariance, which imposes rigid constraints and often leads to sparse and less expressive models. To address this limitation, we introduce the novel concept of quasi-equivariance, which allows metanetworks to move beyond the rigidity of strict equivariance while still preserving functional identity. We lay down a principled basis for this framework and demonstrate its broad applicability across diverse neural architectures, including feedforward, convolutional, and transformer networks. Through empirical evaluation, we show that quasi-equivariant metanetworks achieve good trade-offs between symmetry preservation and representational expressivity. These findings advance the theoretical understanding of weight-space learning and provide a principled foundation for the design of more expressive and functionally robust metanetworks.
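The non-injectivity of the parameter-function mapping can be seen in a few lines. This sketch is illustrative only and uses a tanh network, where negating the incoming weights, bias, and outgoing weight of one hidden neuron leaves the function unchanged because tanh is odd; the helper names `tanh_mlp` and `flip_neuron` are hypothetical.

```python
import math

def tanh_mlp(W1, b1, w2, b2, x):
    """Scalar-output network: w2 . tanh(W1 @ x + b1) + b2."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2

def flip_neuron(W1, b1, w2, k):
    """Negate everything going into and out of hidden neuron k."""
    W1f = [[-w for w in row] if i == k else list(row) for i, row in enumerate(W1)]
    b1f = [-b if i == k else b for i, b in enumerate(b1)]
    w2f = [-w if i == k else w for i, w in enumerate(w2)]
    return W1f, b1f, w2f

W1 = [[0.4, -1.1], [0.9, 0.2], [-0.3, 0.7]]
b1 = [0.1, -0.5, 0.25]
w2 = [1.5, -0.8, 0.6]
b2 = 0.05

W1f, b1f, w2f = flip_neuron(W1, b1, w2, k=1)

inputs = [[0.0, 0.0], [1.0, -2.0], [-0.7, 0.3]]
gaps = [abs(tanh_mlp(W1, b1, w2, b2, x) - tanh_mlp(W1f, b1f, w2f, b2, x))
        for x in inputs]
```

The two parameter sets differ, yet every entry of `gaps` is zero up to rounding, so a metanetwork that reads raw weights sees two different inputs for the same function.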

LG · Nov 25, 2025

Dynamical Properties of Tokens in Self-Attention and Effects of Positional Encoding

Duy-Tung Pham, An The Nguyen, Viet-Hoang Tran et al.

This paper investigates the dynamical properties of tokens in pre-trained Transformer models and explores their application to improving Transformers. To this end, we analyze the dynamical system governing the continuous-time limit of the pre-trained model and characterize the asymptotic behavior of its solutions. Specifically, we characterize when tokens move closer to or farther from one another over time, depending on the model parameters. We provide sufficient conditions, based on these parameters, to identify scenarios where tokens either converge to zero or diverge to infinity. Unlike prior works, our conditions are broader in scope and more applicable to real-world models. Furthermore, we investigate how different forms of positional encoding -- specifically absolute and rotary -- affect these dynamical regimes. Empirical evidence reveals that the convergence scenario adversely impacts model performance. Motivated by these insights, we propose simple refinements to Transformer architectures that mitigate convergence behavior in models with absolute or rotary positional encoding. These findings support theoretical foundations and design principles for improving Transformer models.