IT | Apr 14
Anchor-Aided Multi-User Semantic Communication with Adaptive Decoders
Loc X. Nguyen, Phuong-Nam Tran, Trung Thanh Pham et al.
Semantic communication (SemCom) is gaining momentum as it tries to keep pace with the massive growth in users' demands, in both quantity and quality, with the help of advanced deep learning (DL) techniques. Specifically, SemCom embeds the semantic meaning of the data into the transmission process while eliminating statistical redundancy, which conserves bandwidth for other users. The transmitter therefore encodes the message as concisely as possible, while the receiver interprets it using its DL model and its knowledge of the transmitter's intended meaning. Most existing works consider only one transmitter and one receiver, which limits their ability to address the diversity in users' models and capabilities. In this paper, we therefore propose a multi-user semantic communication system in which each user is equipped with a distinct DL-based joint source-channel decoder architecture, reflecting the diversity in computing capacity. The main challenge in the proposed system is the catastrophic forgetting of neural networks: the DL-based encoder fails to encode the data for a previous user once it is trained with a new user. To address this, we propose an anchor decoder with an architecture that is symmetric to the encoder. The symmetric decoder has the same computational capacity as the encoder, providing feedback that aligns with the encoder's extraction capabilities and enhances optimization efficiency. The parameters of the optimized encoder are then frozen and used to train decoders for various users, aligning them with the encoder outputs. Finally, we conduct a series of simulation experiments to validate the proposed framework against other benchmarks.
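
The two-stage schedule described in this abstract (train the encoder against an anchor decoder, freeze it, then fit each user's decoder to the frozen encoder) can be illustrated with a toy sketch. This is not the authors' implementation: the scalar "encoder" and "decoders", the Gaussian channel noise, the user names `gain_only` and `affine`, and all hyperparameters are assumptions chosen only to keep the example self-contained.

```python
import random

def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def train_two_stage(xs, steps=500, lr=0.05, noise_std=0.05, seed=0):
    """Toy anchor-aided training with scalar models (illustrative assumption)."""
    rng = random.Random(seed)

    # Stage 1: jointly train the encoder gain and the anchor decoder gain.
    enc, anchor = 0.5, 0.5
    for _ in range(steps):
        g_enc = g_anchor = 0.0
        for x in xs:
            z = enc * x + rng.gauss(0.0, noise_std)  # noisy channel output
            err = anchor * z - x                     # reconstruction error
            g_anchor += 2 * err * z
            g_enc += 2 * err * anchor * x
        enc -= lr * g_enc / len(xs)
        anchor -= lr * g_anchor / len(xs)

    # Stage 2: the encoder is frozen; each user decoder is fitted on its own,
    # so adding a user never changes what earlier users receive.
    frozen_enc = enc
    users = {"gain_only": [0.0], "affine": [0.0, 0.0]}  # hypothetical decoders
    for params in users.values():
        for _ in range(steps):
            grads = [0.0] * len(params)
            for x in xs:
                z = frozen_enc * x + rng.gauss(0.0, noise_std)
                bias = params[1] if len(params) > 1 else 0.0
                err = params[0] * z + bias - x
                grads[0] += 2 * err * z
                if len(params) > 1:
                    grads[1] += 2 * err
            for i in range(len(params)):
                params[i] -= lr * grads[i] / len(xs)
    return frozen_enc, anchor, users
```

Because the encoder is never updated in stage 2, the decoders of different users cannot interfere with one another, which is the property the abstract relies on to sidestep catastrophic forgetting.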
LG | Mar 13, 2025 | Code
DeepSeek-Inspired Exploration of RL-based LLMs and Synergy with Wireless Networks: A Survey
Yu Qiao, Phuong-Nam Tran, Ji Su Yoon et al.
Reinforcement learning (RL)-based large language models (LLMs), such as ChatGPT, DeepSeek, and Grok-3, have attracted widespread attention for their remarkable capabilities in multimodal data understanding. Meanwhile, the rapid expansion of information services has led to a growing demand for AI-enabled wireless networks. The open-source DeepSeek models are famous for their innovative designs, such as large-scale pure RL and cost-efficient training, which make them well-suited for practical deployment in wireless networks. By integrating DeepSeek-style LLMs with wireless infrastructures, a synergistic opportunity arises: the DeepSeek-style LLMs enhance network optimization with strong reasoning and decision-making abilities, while wireless infrastructure enables the broad deployment of these models. Motivated by this convergence, this survey presents a comprehensive DeepSeek-inspired exploration of RL-based LLMs in the context of wireless networks. We begin by reviewing key techniques behind network optimization to establish a foundation for understanding DeepSeek-style LLM integration. Next, we examine recent advancements in RL-based LLMs, using DeepSeek models as a representative example. Building on this, we explore the synergy between the two domains, highlighting motivations, challenges, and potential solutions. Finally, we highlight emerging directions for integrating LLMs with wireless networks, such as quantum, on-device, and neural-symbolic LLM models, as well as embodied AI agents. Overall, this survey offers a comprehensive examination of the interplay between DeepSeek-style LLMs and wireless networks, demonstrating how these domains can mutually enhance each other to drive innovation.
CV | Dec 23, 2024 | Code
QTSeg: A Query Token-Based Dual-Mix Attention Framework with Multi-Level Feature Distribution for Medical Image Segmentation
Phuong-Nam Tran, Nhat Truong Pham, Duc Ngoc Minh Dang et al.
Medical image segmentation plays a crucial role in assisting healthcare professionals with accurate diagnoses and enabling automated diagnostic processes. Traditional convolutional neural networks (CNNs) often struggle with capturing long-range dependencies, while transformer-based architectures, despite their effectiveness, come with increased computational complexity. Recent efforts have focused on combining CNNs and transformers to balance performance and efficiency, but existing approaches still face challenges in achieving high segmentation accuracy while maintaining low computational costs. Furthermore, many methods underutilize the CNN encoder's capability to capture local spatial information, concentrating primarily on mitigating long-range dependency issues. To address these limitations, we propose QTSeg, a novel architecture for medical image segmentation that effectively integrates local and global information. QTSeg features a dual-mix attention decoder designed to enhance segmentation performance through: (1) a cross-attention mechanism for improved feature alignment, (2) a spatial attention module to capture long-range dependencies, and (3) a channel attention block to learn inter-channel relationships. Additionally, we introduce a multi-level feature distribution module, which adaptively balances feature propagation between the encoder and decoder, further boosting performance. Extensive experiments on five publicly available datasets covering diverse segmentation tasks, including lesion, polyp, breast cancer, cell, and retinal vessel segmentation, demonstrate that QTSeg outperforms state-of-the-art methods across multiple evaluation metrics while maintaining lower computational costs. Our implementation can be found at: https://github.com/tpnam0901/QTSeg (v1.0.0)
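
As a rough illustration of the cross-attention mechanism mentioned in this abstract, the following sketch implements generic scaled dot-product cross-attention, in which a set of query vectors attends over key/value vectors. It is a textbook formulation and not QTSeg's actual decoder; the function names and the plain-list data layout are assumptions made for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys.

    queries: list of d-dimensional vectors (e.g. query tokens)
    keys, values: lists of equal length (e.g. flattened feature-map positions)
    """
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs
```

When all keys are identical, every position receives the same weight and the output reduces to the mean of the value vectors, which is a convenient sanity check.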
AI | May 11, 2025
Towards Artificial General or Personalized Intelligence? A Survey on Foundation Models for Personalized Federated Intelligence
Yu Qiao, Huy Q. Le, Avi Deb Raha et al.
The rise of large language models (LLMs), such as ChatGPT, DeepSeek, and Grok-3, has reshaped the artificial intelligence landscape. As prominent examples of foundational models (FMs) built on LLMs, these models exhibit remarkable capabilities in generating human-like content, bringing us closer to achieving artificial general intelligence (AGI). However, their large-scale nature, sensitivity to privacy concerns, and substantial computational demands present significant challenges to personalized customization for end users. To bridge this gap, this paper presents the vision of artificial personalized intelligence (API), focusing on adapting these powerful models to meet the specific needs and preferences of users while maintaining privacy and efficiency. Specifically, this paper proposes personalized federated intelligence (PFI), which integrates the privacy-preserving advantages of federated learning (FL) with the zero-shot generalization capabilities of FMs, enabling personalized, efficient, and privacy-protective deployment at the edge. We first review recent advances in both FL and FMs, and discuss the potential of leveraging FMs to enhance federated systems. We then present the key motivations behind realizing PFI and explore promising opportunities in this space, including efficient PFI, trustworthy PFI, and PFI empowered by retrieval-augmented generation (RAG). Finally, we outline key challenges and future research directions for deploying FM-powered FL systems at the edge with improved personalization, computational efficiency, and privacy guarantees. Overall, this survey aims to lay the groundwork for the development of API as a complement to AGI, with a particular focus on PFI as a key enabling technique.
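
For readers unfamiliar with federated learning, the aggregation step that lets clients keep their data local can be sketched as a sample-weighted average of client model parameters (the standard FedAvg rule). This is a generic FL illustration, not the PFI method proposed in the survey; the flat-list parameter representation and the function name `fedavg` are assumptions.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length parameter lists, one per client
    client_sizes:   number of local training samples held by each client
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]
```

Only the parameter vectors leave each client; the raw data never does, which is the privacy-preserving property the abstract refers to.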