Long Tan Le

LG
h-index21
6papers
63citations
Novelty51%
AI Score29

6 Papers

LGDec 23, 2022
Federated PCA on Grassmann Manifold for Anomaly Detection in IoT Networks

Tung-Anh Nguyen, Jiayu He, Long Tan Le et al.

In the era of Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Components Analysis (PCA) has been proposed to separate network traffics into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, the privacy concerns and limitations of devices' computing resources compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the profile of various IoT devices' traffic. Then, we investigate the alternating direction method of multipliers gradient-based learning on the Grassmann manifold to guarantee fast training and the absence of detecting latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is highly adapted for IoT anomaly detection, which permits drastically reducing the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection meeting the requirements of IoT networks.

LGJul 10, 2024
Federated PCA on Grassmann Manifold for IoT Anomaly Detection

Tung-Anh Nguyen, Long Tan Le, Tuan Dung Nguyen et al.

With the proliferation of the Internet of Things (IoT) and the rising interconnectedness of devices, network security faces significant challenges, especially from anomalous activities. While traditional machine learning-based intrusion detection systems (ML-IDS) effectively employ supervised learning methods, they possess limitations such as the requirement for labeled data and challenges with high dimensionality. Recent unsupervised ML-IDS approaches such as AutoEncoders and Generative Adversarial Networks (GAN) offer alternative solutions but pose challenges in deployment onto resource-constrained IoT devices and in interpretability. To address these concerns, this paper proposes a novel federated unsupervised anomaly detection framework, FedPCA, that leverages Principal Component Analysis (PCA) and the Alternating Directions Method Multipliers (ADMM) to learn common representations of distributed non-i.i.d. datasets. Building on the FedPCA framework, we propose two algorithms, FEDPE in Euclidean space and FEDPG on Grassmann manifolds. Our approach enables real-time threat detection and mitigation at the device level, enhancing network resilience while ensuring privacy. Moreover, the proposed algorithms are accompanied by theoretical convergence rates even under a subsampling scheme, a novel result. Experimental results on the UNSW-NB15 and TON-IoT datasets show that our proposed methods offer performance in anomaly detection comparable to nonlinear baselines, while providing significant improvements in communication and memory efficiency, underscoring their potential for securing IoT networks.

LGJun 3, 2022
On the Generalization of Wasserstein Robust Federated Learning

Tung-Anh Nguyen, Tuan Dung Nguyen, Long Tan Le et al.

In federated learning, participating clients typically possess non-i.i.d. data, posing a significant challenge to generalization to unseen distributions. To address this, we propose a Wasserstein distributionally robust optimization scheme called WAFL. Leveraging its duality, we frame WAFL as an empirical surrogate risk minimization problem, and solve it using a local SGD-based algorithm with convergence guarantees. We show that the robustness of WAFL is more general than related approaches, and the generalization bound is robust to all adversarial distributions inside the Wasserstein ball (ambiguity set). Since the center location and radius of the Wasserstein ball can be suitably modified, WAFL shows its applicability not only in robustness but also in domain adaptation. Through empirical evaluation, we demonstrate that WAFL generalizes better than the vanilla FedAvg in non-i.i.d. settings, and is more robust than other related methods in distribution shift settings. Further, using benchmark datasets we show that WAFL is capable of generalizing to unseen target domains.

LGSep 27, 2023
Federated Deep Equilibrium Learning: Harnessing Compact Global Representations to Enhance Personalization

Long Tan Le, Tuan Dung Nguyen, Tung-Anh Nguyen et al.

Federated Learning (FL) has emerged as a groundbreaking distributed learning paradigm enabling clients to train a global model collaboratively without exchanging data. Despite enhancing privacy and efficiency in information retrieval and knowledge management contexts, training and deploying FL models confront significant challenges such as communication bottlenecks, data heterogeneity, and memory limitations. To comprehensively address these challenges, we introduce FeDEQ, a novel FL framework that incorporates deep equilibrium learning and consensus optimization to harness compact global data representations for efficient personalization. Specifically, we design a unique model structure featuring an equilibrium layer for global representation extraction, followed by explicit layers tailored for local personalization. We then propose a novel FL algorithm rooted in the alternating directions method of multipliers (ADMM), which enables the joint optimization of a shared equilibrium layer and individual personalized layers across distributed datasets. Our theoretical analysis confirms that FeDEQ converges to a stationary point, achieving both compact global representations and optimal personalized parameters for each client. Extensive experiments on various benchmarks demonstrate that FeDEQ matches the performance of state-of-the-art personalized FL methods, while significantly reducing communication size by up to 4 times and memory footprint by 1.5 times during training.

LGMar 14, 2025
Federated Koopman-Reservoir Learning for Large-Scale Multivariate Time-Series Anomaly Detection

Long Tan Le, Tung-Anh Nguyen, Han Shu et al.

The proliferation of edge devices has dramatically increased the generation of multivariate time-series (MVTS) data, essential for applications from healthcare to smart cities. Such data streams, however, are vulnerable to anomalies that signal crucial problems like system failures or security incidents. Traditional MVTS anomaly detection methods, encompassing statistical and centralized machine learning approaches, struggle with the heterogeneity, variability, and privacy concerns of large-scale, distributed environments. In response, we introduce FedKO, a novel unsupervised Federated Learning framework that leverages the linear predictive capabilities of Koopman operator theory along with the dynamic adaptability of Reservoir Computing. This enables effective spatiotemporal processing and privacy preservation for MVTS data. FedKO is formulated as a bi-level optimization problem, utilizing a specific federated algorithm to explore a shared Reservoir-Koopman model across diverse datasets. Such a model is then deployable on edge devices for efficient detection of anomalies in local MVTS streams. Experimental results across various datasets showcase FedKO's superior performance against state-of-the-art methods in MVTS anomaly detection. Moreover, FedKO reduces up to 8x communication size and 2x memory usage, making it highly suitable for large-scale systems.

NIMay 28, 2025
Distributionally Robust Wireless Semantic Communication with Large AI Models

Long Tan Le, Senura Hansaja Wanasekara, Zerun Niu et al.

Semantic communication (SemCom) has emerged as a promising paradigm for 6G wireless systems by transmitting task-relevant information rather than raw bits, yet existing approaches remain vulnerable to dual sources of uncertainty: semantic misinterpretation arising from imperfect feature extraction and transmission-level perturbations from channel noise. Current deep learning based SemCom systems typically employ domain-specific architectures that lack robustness guarantees and fail to generalize across diverse noise conditions, adversarial attacks, and out-of-distribution data. In this paper, a novel and generalized semantic communication framework called WaSeCom is proposed to systematically address uncertainty and enhance robustness. In particular, Wasserstein distributionally robust optimization is employed to provide resilience against semantic misinterpretation and channel perturbations. A rigorous theoretical analysis is performed to establish the robust generalization guarantees of the proposed framework. Experimental results on image and text transmission demonstrate that WaSeCom achieves improved robustness under noise and adversarial perturbations. These results highlight its effectiveness in preserving semantic fidelity across varying wireless conditions.