Hyundong Shin

IT
h-index: 74
22 papers
147 citations
Novelty: 41%
AI Score: 51

22 Papers

CV Jun 3, 2023
Segment Anything Meets Semantic Communication

Shehbaz Tariq, Brian Estadimas Arfeto, Chaoning Zhang et al.

In light of the diminishing returns of traditional methods for enhancing transmission rates, the domain of semantic communication presents promising new frontiers. Focusing on image transmission, this paper explores the application of foundation models, particularly the Segment Anything Model (SAM) developed by Meta AI Research, to improve semantic communication. SAM is a promptable image segmentation model that has gained attention for its ability to perform zero-shot segmentation tasks without explicit training or domain-specific knowledge. By employing SAM's segmentation capability and lightweight neural network architecture for semantic coding, we propose a practical approach to semantic communication. We demonstrate that this approach retains critical semantic features, achieving higher image reconstruction quality and reducing communication overhead. This practical solution eliminates the resource-intensive stage of training a segmentation model and can be applied to any semantic coding architecture, paving the way for real-world applications.

SP May 26
Geometry-Structured Channel Reconstruction for Conventional and Fluid Antenna Systems: Bayesian Inference and Fundamental Limits

Zhentian Zhang, Kai-Kit Wong, Kaitao Meng et al.

Accurate channel state information (CSI) acquisition is critical for exploiting the spatial flexibility of fluid antenna systems (FASs). However, port selection and transmission optimization require CSI over a large number of candidate port positions, making direct port-wise estimation prohibitively costly in terms of pilot overhead. This paper addresses this challenge through geometry-structured channel reconstruction, which exploits the fact that the port-domain CSI can be parameterized by a small number of dominant propagation paths. We first establish fundamental mean square error (MSE) and normalized MSE (NMSE) benchmarks for both geometry-structured and unstructured channel reconstruction, providing analytical references for evaluating the intrinsic benefit of geometric modeling in conventional antenna systems and FASs. Motivated by the strong spatial correlation induced by densely distributed fluid antenna ports, we further propose a Bayesian reconstruction framework, termed geometry-structured expectation-maximization approximate message passing (GS-EM-AMP). The proposed algorithm incorporates geometric channel structure into the EM-AMP procedure and adaptively learns unknown statistical parameters from noisy observations. Numerical results demonstrate that GS-EM-AMP achieves near-bound reconstruction accuracy while maintaining strong robustness against steering-domain correlation, thereby offering an efficient and reliable solution for large-scale CSI acquisition in FASs.

CV Jun 3, 2023
Understanding Segment Anything Model: SAM is Biased Towards Texture Rather than Shape

Chaoning Zhang, Yu Qiao, Shehbaz Tariq et al.

In contrast to human vision, which mainly depends on shape to recognize objects, deep image recognition models are widely known to be biased toward texture. Recently, the Meta research team released the first foundation model for image segmentation, termed the segment anything model (SAM), which has attracted significant attention. In this work, we study SAM from the perspective of texture vs. shape. Unlike label-oriented recognition tasks, SAM is trained to predict a mask covering the object shape based on a prompt. Given this, it seems self-evident that SAM is biased towards shape. In this work, however, we reveal an interesting finding: SAM is strongly biased towards texture-like dense features rather than shape. This intriguing finding is supported by a novel setup in which we disentangle texture and shape cues and design a texture-shape cue conflict for mask prediction.

IT Apr 17
Beyond Covariance: Generative Spatial Correlation Modeling and Channel Interpolation for Fluid Antenna Systems

Zhentian Zhang, Hao Jiang, Kai-Kit Wong et al.

Fluid antenna systems (FAS) enable unprecedented spatial diversity within a compact form factor by flexibly switching among high-density antenna ports. To activate this capability, channel state information (CSI) over the ports is required, which implies high estimation overhead because the number of ports is usually very large. Conventional estimation schemes tend to first estimate the CSI for a small number of ports and then infer the CSI for the remaining antenna ports by interpolation exploiting correlation characteristics. However, existing correlation-based techniques lack generalization ability, and the fundamental limits of interpolating the CSI from sparse observations remain poorly understood. This paper adopts a generative modeling framework for characterizing the channel correlation among the FAS ports that departs fundamentally from covariance-descriptive models. Specifically, we represent the spatially sampled channel as a $p$th-order autoregressive (AR) Gauss-Markov process, which provides a principled and tunable tradeoff between model complexity and approximation accuracy via the AR order. In so doing, we can characterize the limits of channel interpolation by deriving the globally optimal minimum mean-square error (MMSE) estimator and establishing a tight lower bound on the minimum number of observations required to meet a prescribed reconstruction error. To reduce the complexity of MMSE estimation, we then exploit the state-space structure due to the ${\rm AR}(p)$ model and develop a Kalman filtering/smoothing-based interpolation algorithm. The resulting method attains the optimal MMSE performance with strictly linear complexity $\mathcal{O}(N)$ with $N$ denoting the number of ports, resulting in a scalable, efficient, and theoretically grounded framework for practical FAS channel reconstruction.
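
The Kalman filtering/smoothing idea in the abstract above can be illustrated with a minimal sketch. It assumes a real-valued, unit-variance AR(1) process (the simplest case of the AR($p$) model), noisy observations at a sparse subset of ports, and hypothetical values for the AR coefficient `a` and the noise variance `noise_var`. It is not the paper's algorithm, only a standard forward Kalman filter followed by a Rauch-Tung-Striebel backward pass, which runs in time linear in the number of ports.

```python
import random


def kalman_smooth_ar1(obs, n_ports, a=0.95, noise_var=0.01):
    """Interpolate an AR(1) channel from sparse noisy observations.

    obs maps a port index to its noisy observation (hypothetical input format).
    Returns the smoothed means and variances for all ports.
    """
    q = 1.0 - a * a  # process noise that keeps the stationary variance at 1
    m_pred = [0.0] * n_ports
    p_pred = [0.0] * n_ports
    m_filt = [0.0] * n_ports
    p_filt = [0.0] * n_ports

    # Forward pass: predict from the previous port, update if observed.
    for n in range(n_ports):
        if n == 0:
            m_pred[n], p_pred[n] = 0.0, 1.0
        else:
            m_pred[n] = a * m_filt[n - 1]
            p_pred[n] = a * a * p_filt[n - 1] + q
        if n in obs:
            gain = p_pred[n] / (p_pred[n] + noise_var)
            m_filt[n] = m_pred[n] + gain * (obs[n] - m_pred[n])
            p_filt[n] = (1.0 - gain) * p_pred[n]
        else:
            m_filt[n], p_filt[n] = m_pred[n], p_pred[n]

    # Backward pass (Rauch-Tung-Striebel smoother).
    m_s, p_s = m_filt[:], p_filt[:]
    for n in range(n_ports - 2, -1, -1):
        g = a * p_filt[n] / p_pred[n + 1]
        m_s[n] = m_filt[n] + g * (m_s[n + 1] - m_pred[n + 1])
        p_s[n] = p_filt[n] + g * g * (p_s[n + 1] - p_pred[n + 1])
    return m_s, p_s


# Usage: simulate a channel over 40 ports and observe every 4th port.
rng = random.Random(1)
a, noise_var, n_ports = 0.95, 0.01, 40
channel = [rng.gauss(0.0, 1.0)]
for _ in range(n_ports - 1):
    channel.append(a * channel[-1] + rng.gauss(0.0, (1.0 - a * a) ** 0.5))
observations = {
    n: channel[n] + rng.gauss(0.0, noise_var ** 0.5)
    for n in range(0, n_ports, 4)
}
means, variances = kalman_smooth_ar1(observations, n_ports, a, noise_var)
mse = sum((m - h) ** 2 for m, h in zip(means, channel)) / n_ports
print("empirical MSE over all ports:", mse)
```

The smoothed variance at each port indicates how well that port is reconstructed, so it can also be used to judge how many observations a target error would need under this simplified model.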

IT Mar 26
Enormous Fluid Antenna Systems (E-FAS) under Correlated Surface-Wave Leakage: Physical Layer Security

Farshad Rostami Ghadi, Kai-Kit Wong, Masoud Kaveh et al.

Enormous fluid antenna systems (E-FAS) have recently emerged as a surface-wave (SW)-enabled architecture that can induce controllable large-scale channel gains through guided electromagnetic routing. This paper develops a secrecy analysis framework for E-FAS-assisted downlink transmission with practical pilot-based channel estimation. We consider a multiple-input single-output (MISO) wiretap setting in which the base station (BS) performs minimum mean-square-error (MMSE) channel estimation and adopts maximum-ratio transmission (MRT) with artificial noise (AN). To capture the leakage of SW routing in E-FAS, we introduce a correlated SW-leakage model that accounts for statistical coupling between the legitimate and eavesdropper channels caused by partially overlapping SW propagation paths. Exploiting the two-timescale nature, with slowly varying routing gain and small-scale block fading, we then derive a closed-form conditional expression for the secrecy outage probability (SOP) and a tractable characterization of the ergodic secrecy rate (ESR) in the presence of correlated quadratic forms. Our analysis yields three key insights: (i) secrecy collapses at high transmit power if and only if AN is not present, whereas any strictly positive AN can prevent asymptotic collapse; (ii) the optimal data-AN power split is achieved by a strictly interior solution; and (iii) routing gain improves both the received signal strength and the channel-estimation quality, creating a nonlinear coupling that raises the signal-to-interference-plus-noise ratio (SINR) ceiling in the high signal-to-noise ratio (SNR) regime, and disperses secrecy across routing states. Numerical results indicate that E-FAS markedly enlarges the secure operating region compared with conventional space-wave transmission.

IT May 21
Finite-Aperture Planar Fluid Antenna Array

Zhentian Zhang, Jingyuan Xu, Kai-Kit Wong et al.

Fluid antenna systems (FASs) are emerging as a reconfigurable-aperture technology that expands physical-layer design beyond fixed, rigid antenna geometries. While the \emph{fading diversity} of FASs -- which exploits spatial channel fluctuations for signal enhancement and interference avoidance -- has been widely studied, the \emph{geometry diversity} created by reconfigurable port placement remains far less understood, particularly for planar architectures under finite-aperture constraints. This paper develops a systematic analytical framework for finite-aperture planar fluid antenna arrays (FAAs). First, we derive a closed-form characterization of the minimum inter-port distance under uniform random placement over a rectangular aperture and show that it follows a Rayleigh law. Its mean scales as $\mathcal{O}(M^{-1})$, where $M$ is the number of candidate ports, in sharp contrast to the $\mathcal{O}(M^{-2})$ behavior in the linear case, revealing a fundamentally more favorable packing geometry in two dimensions. Second, we establish a universal Cramér-Rao bound (CRB) for joint elevation-azimuth estimation, governed by a $2\times 2$ \emph{geometric inertia matrix} whose determinant and eigenstructure fully capture the role of port placement in estimation precision. We further prove that both the trace and determinant of this matrix are invariant to the azimuth look direction. Third, we uncover an intrinsic \emph{precision-ambiguity trade-off}: maximizing the geometric determinant to minimize the CRB drives ports toward the aperture boundary, but simultaneously increases sidelobe-induced spatial ambiguity.
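
The $\mathcal{O}(M^{-1})$ scaling of the mean minimum inter-port distance can be checked numerically with a small Monte Carlo sketch. It assumes a unit-square aperture and uniform random placement; the trial count, seed, and function names are hypothetical choices, and the paper's closed-form Rayleigh law is not reproduced here. If the scaling holds, the product of $M$ and the estimated mean should stay roughly constant as $M$ grows.

```python
import math
import random


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points."""
    best = math.inf
    for i in range(len(points)):
        xi, yi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            d = math.hypot(xi - xj, yi - yj)
            if d < best:
                best = d
    return best


def mean_min_distance(num_ports, trials=200, seed=0):
    """Monte Carlo estimate of the mean minimum spacing on a unit square."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = [(rng.random(), rng.random()) for _ in range(num_ports)]
        total += min_pairwise_distance(pts)
    return total / trials


# Print M, the estimated mean, and M times the mean for a few port counts.
for m in (8, 16, 32, 64):
    d = mean_min_distance(m)
    print(m, round(d, 4), round(m * d, 3))
```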

IT May 21
Fluid RIS (FRIS)-Assisted Index Modulation for 6G Wireless Communications

Xusheng Zhu, Kai-Kit Wong, Sai Xu et al.

Fluid reconfigurable intelligent surfaces (FRIS) extend conventional reconfigurable intelligent surfaces (RIS) by adding spatial reconfigurability through switchable apertures, pattern-reconfigurable units, fluidic conductive materials, or movable surface elements. This article studies how FRIS can support index modulation (IM), where information bits select a surface configuration and the receiver detects the index from the induced receiver-side response. A key challenge is that many feasible FRIS layouts do not necessarily lead to many reliable spatial indices. After propagation, mutual coupling, hardware distortion, and receiver observation, different layouts may produce similar receiver-side responses and cause index-detection errors. To address this issue, we present a response-aware design view, in which FRIS spatial codebooks are selected according to response-domain separability rather than layout diversity alone. We also discuss actuation granularity as a practical design knob that balances spatial diversity, pilot overhead, coupling robustness, and hardware feasibility. The resulting workflow helps select compact, trainable, and controllable spatial-index codebooks from dense FRIS layouts, providing design guidance for future programmable wireless environments.

SY Apr 19
WirelessAgent: A Unified Agent Design for General Wireless Resource Allocation Problem without Current Channel State Information

Ran Yi, Ruopeng Xu, Dongshu Zhao et al.

This paper investigates agent design for solving the wireless resource allocation problem without sufficient channel state information (CSI), which cannot be effectively solved via conventional methods. In the considered wireless agent design, we provide a general sense-repair-decide-act workflow, which can be used to intelligently solve general wireless resource allocation problems. A multi-objective optimization problem is formulated to adaptively satisfy different user requirements including both spectrum and energy efficiency. This work addresses the challenge of incomplete CSI for multiple optimization objectives. To solve this problem, we use an artificial intelligence (AI) model to predict missing channel data and construct an agent on the Coze platform, allowing network operators to optimize multiple objectives through natural language conversations. To tackle the resource scheduling under different objectives, we develop adaptive algorithms. Simulation results validate the effectiveness of our proposed design, demonstrating that the proposed AI method reduces the root mean square error by up to approximately 67% compared to the traditional approach. Moreover, the data-driven scheduling balances system performance compared to conventional baseline approaches.

SI Dec 29, 2025
Quantum Intelligence Meets BD-RIS-Enabled AmBC: Challenges, Opportunities, and Practical Insights

Abd Ullah Khan, Uman Khalid, Trung Q. Duong et al.

A beyond-diagonal reconfigurable intelligent surface (BD-RIS) is an innovative type of reconfigurable intelligent surface (RIS) that has recently been proposed and is considered a revolutionary advancement in wave manipulation. Unlike the mutually disconnected arrangement of elements in traditional RISs, BD-RIS creates cost-effective and simple inter-element connections, allowing for greater freedom in configuring the amplitude and phase of impinging waves. However, there are numerous underlying challenges in realizing the advantages associated with BD-RIS, prompting the research community to actively investigate cutting-edge schemes and algorithms in this direction. Particularly, the passive beamforming design for BD-RIS under specific environmental conditions has become a major focus in this research area. In this article, we provide a systematic introduction to BD-RIS, elaborating on its functional principles concerning architectural design, promising advantages, and classification. Subsequently, we present recent advances and identify a series of challenges and opportunities. Additionally, we consider a specific case study where beamforming is designed using four different algorithms, and we analyze their performance with respect to sum rate and computation cost. To augment the beamforming capabilities in 6G BD-RIS with quantum enhancement, we analyze various hybrid quantum-classical machine learning (ML) models to improve beam prediction performance, employing real-world communication Scenario 8 from the DeepSense 6G dataset. Consequently, we derive useful insights about the practical implications of BD-RIS.

NI Dec 25, 2025
Multiconnectivity for SAGIN: Current Trends, Challenges, AI-driven Solutions, and Opportunities

Abd Ullah Khan, Adnan Shahid, Haejoon Jung et al.

Space-air-ground-integrated network (SAGIN)-enabled multiconnectivity (MC) is emerging as a key enabler for next-generation networks, enabling users to simultaneously utilize multiple links across multi-layer non-terrestrial networks (NTN) and multi-radio access technology (multi-RAT) terrestrial networks (TN). However, the heterogeneity of TN and NTN introduces complex architectural challenges that complicate MC implementation. Specifically, the diversity of link types, spanning air-to-air, air-to-space, space-to-space, space-to-ground, and ground-to-ground communications, renders optimal resource allocation highly complex. Recent advancements in reinforcement learning (RL) and agentic artificial intelligence (AI) have shown remarkable effectiveness in optimal decision-making in complex and dynamic environments. In this paper, we review the current developments in SAGIN-enabled MC and outline the key challenges associated with its implementation. We further highlight the transformative potential of AI-driven approaches for resource optimization in a heterogeneous SAGIN environment. To this end, we present a case study on resource allocation optimization enabled by agentic RL for SAGIN-enabled MC involving diverse radio access technologies (RATs). Results show that learning-based methods can effectively handle complex scenarios and substantially enhance network performance in terms of latency and capacity while incurring a moderate increase in power consumption as an acceptable tradeoff. Finally, open research problems and future directions are presented to realize efficient SAGIN-enabled MC.

CV Jan 5
Adaptive Hybrid Optimizer based Framework for Lumpy Skin Disease Identification

Ubaidullah, Muhammad Abid Hussain, Mohsin Raza Jafri et al.

Lumpy Skin Disease (LSD) is a contagious viral infection that significantly deteriorates livestock health, thereby posing a serious threat to the global economy and food security. Owing to its rapid spread, early and precise identification is crucial to prevent outbreaks and ensure timely intervention. In this paper, we propose a hybrid deep learning-based approach called LUMPNet for the early detection of LSD. LUMPNet utilizes image data to detect and classify skin nodules -- the primary indicator of LSD. To this end, LUMPNet uses YOLOv11, an EfficientNet-based CNN classifier with compound scaling, and a novel adaptive hybrid optimizer. More precisely, LUMPNet detects and localizes LSD skin nodules and lesions on cattle images. It exploits EfficientNet to classify the localized cattle images into LSD-affected or healthy categories. To stabilize and accelerate the training of the hybrid YOLOv11 and EfficientNet model, a novel adaptive hybrid optimizer is proposed and utilized. We evaluate LUMPNet at various stages of LSD using a publicly available dataset. Results indicate that the proposed scheme achieves 99% LSD detection training accuracy and outperforms existing schemes. The model also achieves a validation accuracy of 98%. Moreover, for further evaluation, we conduct a case study using an optimized EfficientNet-B0 model trained with the AdamW optimizer, and compare its performance with LUMPNet. The results show that LUMPNet achieves superior performance.

LG Mar 3
Joint Optimization of Model Partitioning and Resource Allocation for Anti-Jamming Collaborative Inference Systems

Mengru Wu, Jiawei Li, Jiaqi Wei et al.

With the increasing computational demands of deep neural network (DNN) inference on resource-constrained devices, DNN partitioning-based device-edge collaborative inference has emerged as a promising paradigm. However, the transmission of intermediate feature data is vulnerable to malicious jamming, which significantly degrades the overall inference performance. To counter this threat, this letter focuses on an anti-jamming collaborative inference system in the presence of a malicious jammer. In this system, a DNN model is partitioned into two distinct segments, which are executed by wireless devices and edge servers, respectively. We first analyze the effects of jamming and DNN partitioning on inference accuracy via data regression. Based on this, our objective is to maximize the system's revenue of delay and accuracy (RDA) under inference accuracy and computing resource constraints by jointly optimizing computation resource allocation, devices' transmit power, and DNN partitioning. To address the mixed-integer nonlinear programming problem, we propose an efficient alternating optimization-based algorithm, which decomposes the problem into three subproblems that are solved via Karush-Kuhn-Tucker conditions, convex optimization methods, and a quantum genetic algorithm, respectively. Extensive simulations demonstrate that our proposed scheme outperforms baselines in terms of RDA.

IT May 7
Fluid Antenna Systems Enabling 6G HRLLC With Port Switching Delay

Xusheng Zhu, Kai-Kit Wong, Hao Xu et al.

Fluid antenna systems (FAS) exploit antenna position reconfigurability to unlock massive spatial diversity within compact form factors, making them a promising enabler for 6G user terminals (UTs). However, practical port switching incurs latency and signaling overhead, which can be particularly detrimental to hyper-reliable low-latency communications (HRLLC) under finite blocklength operation. This paper investigates FAS-enabled HRLLC by explicitly capturing the coupled effects of spatial correlation, port switching delay, and finite blocklength coding. We derive exact closed-form expressions for the average block error rate (BLER) and average achievable rate over spatially correlated fading channels. The resulting analysis reveals a fundamental design trade-off: increasing the number of ports improves diversity but linearly reduces the effective blocklength, thereby intensifying finite-blocklength penalties. A key theoretical contribution is a rigorous proof that reliability, achievable rate, and energy efficiency are strictly unimodal in the port dimension, ensuring a unique optimal port configuration. Furthermore, we characterize an explicit switching-delay threshold that separates regimes where FAS yields net gains over fixed-position antenna (FPA) systems. Numerical results validate the analysis and show that substantial HRLLC performance gains are achievable when the switching latency remains below the derived bound.

IT May 6
Phased Ultra Massive Array (PUMA)

Hanjiang Hong, Kai-Kit Wong, Xusheng Zhu et al.

This paper proposes a novel multiple-access framework, termed the phased ultra massive antenna array (PUMA), which exploits the distinctive spatial flexibility of fluid antenna systems (FAS) at the user equipment (UE). Building upon fluid antenna multiple access (FAMA) and compact ultra-massive antenna array (CUMA), PUMA incorporates a phased array for signal aggregation. This architecture enables the UE to inherently mitigate co-user interference within the spatial domain without necessitating channel state information (CSI) for precoding at the base station (BS) or complex interference cancellation at each UE. A primary advantage of PUMA lies in its hardware efficiency: by implementing phase shifting and signal combining in the analog domain, it achieves high antenna gain while requiring only a minimal number of radio-frequency (RF) chains, potentially a single RF chain. Comprehensive theoretical analysis of the achievable data rate is provided, complemented by extensive simulations that validate the framework. The results demonstrate that PUMA markedly outperforms FAMA and CUMA architectures, particularly for UEs with a single RF chain, offering a robust and scalable solution for interference-insensitive massive connectivity in sixth-generation (6G) systems.

SP Jan 29, 2024
Extreme Learning Machine-based Channel Estimation in IRS-Assisted Multi-User ISAC System

Yu Liu, Ibrahim Al-Nahhal, Octavia A. Dobre et al.

Multi-user integrated sensing and communication (ISAC) assisted by an intelligent reflecting surface (IRS) has recently been investigated to provide transmission with high spectral and energy efficiency. This paper proposes, for the first time, a practical channel estimation approach for an IRS-assisted multi-user ISAC system. The estimation problem in such a system is challenging since the sensing and communication (SAC) signals interfere with each other, and the passive IRS lacks signal processing ability. A two-stage approach is proposed to decompose the overall estimation problem into subproblems, successively estimating the direct and reflected channels. Based on this scheme, the ISAC base station (BS) estimates all the SAC channels associated with the target and uplink users, while each downlink user estimates the downlink communication channels individually. Considering the low-cost requirements of the ISAC BS and downlink users, the proposed two-stage approach is realized by an efficient neural network (NN) framework that contains two different extreme learning machine (ELM) structures to estimate the above SAC channels. Moreover, two types of input-output pairs to train the ELMs are carefully devised, which impact the estimation accuracy and computational complexity under different system parameters. Simulation results reveal a substantial performance improvement achieved by the proposed ELM-based approach over the least-squares and NN-based benchmarks, with reduced training complexity and faster training speed.

IT Jan 26
Finite-Aperture Fluid Antenna Array Design: Analysis and Algorithm

Zhentian Zhang, Kai-Kit Wong, Hao Jiang et al.

Finite-aperture constraints render array design nontrivial and can undermine the effectiveness of classical sparse geometries. This letter provides universal guidance for fluid antenna array (FAA) design under a fixed aperture. We derive a closed-form Cramér-Rao bound (CRB) that unifies conventional and reconfigurable arrays by explicitly linking the Fisher information to the geometric variance of port locations. We further obtain a closed-form probability density function of the minimum spacing under random FAA placement, which yields a principled lower bound for the minimum-spacing constraint. Building upon these analytical insights, we then propose a gradient-based algorithm to optimize continuous port locations. Utilizing a simple gradient update design, the optimized FAA can achieve about a $30\%$ CRB reduction and a $42.5\%$ reduction in mean-squared error.

AI Apr 25
CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Shuxu Chen, Yitian Zhou, Jiaquan Zhang et al.

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on long, multi-step problems, leading to inconsistent answers for an unchanged task. Most prior work focuses on improving the forward reasoning chain within a single pass, with less attention to iterative and contrastive correction. To address this gap, we propose CAP-CoT, a Cycle Adversarial Prompt optimization framework designed to improve both the CoT reasoning accuracy and the stability of a single deployed solver. In each cycle, a forward solver generates candidate reasoning chains, an adversarial challenger constructs plausible but deliberately flawed chains using targeted error strategies, and a feedback agent contrasts the two chains and produces step-aligned structured feedback. This feedback closes the optimization loop in two directions: updating the solver prompt based on errors exposed by the challenger, and updating the challenger prompt to generate increasingly targeted errors in subsequent cycles. Unlike safety-oriented adversarial prompting such as jailbreak or prompt-injection attacks, our adversarial component is task-semantic and aims to expose logical vulnerabilities in reasoning chains. Experiments across six benchmarks and four LLM backbones demonstrate that within two to three adversarial prompt optimization cycles, CAP-CoT consistently reduces variability across runs while improving reasoning accuracy and robustness to prompt perturbations.

DC Apr 24
Network Edge Inference for Large Language Models: Principles, Techniques, and Opportunities

Zhixiong Chen, Bingjie Zhu, Jiangzhou Wang et al.

Large language models (LLMs) have advanced rapidly, emerging as versatile tools across fields thanks to their exceptional language understanding, generation, and reasoning capabilities. However, performing LLM inference at the network edge remains challenging due to their large memory and compute demands. This survey outlines the challenges specific to LLM edge inference and provides a comprehensive overview of recent progress, covering system architectures, model optimization and deployment, and resource management and scheduling. By synthesizing state-of-the-art techniques and mapping future directions, this survey aims to unlock the potential of LLMs in resource-constrained edge environments.

AI Jan 17, 2025
GenSC-6G: A Prototype Testbed for Integrated Generative AI, Quantum, and Semantic Communication

Brian E. Arfeto, Shehbaz Tariq, Uman Khalid et al.

We introduce a prototyping testbed, GenSC-6G, developed to generate a comprehensive dataset that supports the integration of generative artificial intelligence (AI), quantum computing, and semantic communication for emerging sixth-generation (6G) applications. The GenSC-6G dataset is designed with noise-augmented synthetic data optimized for semantic decoding, classification, and localization tasks, significantly enhancing flexibility for diverse AI-driven communication applications. This adaptable prototype supports seamless modifications across baseline models, communication modules, and goal-oriented decoders. Case studies demonstrate its application in lightweight classification, semantic upsampling, and edge-based language inference under noise conditions. The GenSC-6G dataset serves as a scalable and robust resource for developing goal-oriented communication systems tailored to the growing demands of 6G networks.

SP Mar 2
Orchestrating Multimodal DNN Workloads in Wireless Neural Processing

Sai Xu, Kai-Kit Wong, Yanan Du et al.

In edge inference, wireless resource allocation and accelerator-level deep neural network (DNN) scheduling have yet to be co-optimized in an end-to-end manner. The lack of coordination between wireless transmission and accelerator-level DNN execution prevents efficient overlap, leading to higher end-to-end inference latency. To address this issue, this paper investigates multimodal DNN workload orchestration in wireless neural processing (WNP), a paradigm that integrates wireless transmission and multi-core accelerator execution into a unified end-to-end pipeline. First, we develop a unified communication-computation model for multimodal DNN execution and formulate the corresponding optimization problem. Second, we propose O-WiN, a framework that orchestrates DNN workloads in WNP through two tightly coupled stages: simulation-based optimization and runtime execution. Third, we develop two algorithms, RTFS and PACS. RTFS schedules communication and computation sequentially, whereas PACS interleaves them to enable pipeline parallelism by overlapping wireless data transfer with accelerator-level DNN execution. Simulation results demonstrate that PACS significantly outperforms RTFS under high modality heterogeneity by better masking wireless latency through communication-computation overlap, thereby highlighting the effectiveness of communication-computation pipelining in accelerating multimodal DNN execution in WNP.
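
The benefit of overlapping wireless transfer with accelerator execution can be seen in a toy two-stage timing model. The sketch below is not RTFS or PACS. It assumes a single wireless link and a single accelerator, tasks processed in a fixed order, and hypothetical per-task transfer and compute times. In the sequential case every transfer finishes before any computation starts. In the pipelined case each task starts computing as soon as its own data has arrived and the accelerator is free.

```python
def sequential_latency(comm, comp):
    """All transfers complete first, then all computations run."""
    return sum(comm) + sum(comp)


def pipelined_latency(comm, comp):
    """Transfers and computations overlap in a two-stage pipeline."""
    link_free = 0.0   # time at which the link finishes the current transfer
    accel_free = 0.0  # time at which the accelerator finishes its current task
    for t, c in zip(comm, comp):
        link_free += t  # transfers are serialized on the single link
        # A task starts computing once its data has arrived and the
        # accelerator has finished the previous task.
        accel_free = max(accel_free, link_free) + c
    return accel_free


# Hypothetical per-task transfer and compute times (arbitrary units).
comm_times = [4, 1, 3]
comp_times = [2, 5, 1]
print("sequential:", sequential_latency(comm_times, comp_times))
print("pipelined: ", pipelined_latency(comm_times, comp_times))
```

In this model the pipelined latency never exceeds the sequential one, and the gap grows when transfer and compute times are unevenly matched across tasks.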

SY Sep 17, 2025
Large Language Model-Empowered Decision Transformer for UAV-Enabled Data Collection

Zhixion Chen, Jiangzhou Wang, Hyundong Shin et al.

The deployment of unmanned aerial vehicles (UAVs) for reliable and energy-efficient data collection from spatially distributed devices holds great promise in supporting diverse Internet of Things (IoT) applications. Nevertheless, the limited endurance and communication range of UAVs necessitate intelligent trajectory planning. While reinforcement learning (RL) has been extensively explored for UAV trajectory optimization, its interactive nature entails high costs and risks in real-world environments. Offline RL mitigates these issues but remains susceptible to unstable training and relies heavily on expert-quality datasets. To address these challenges, we formulate a joint UAV trajectory planning and resource allocation problem to maximize the energy efficiency of data collection. The resource allocation subproblem is first transformed into an equivalent linear programming formulation and solved optimally with polynomial-time complexity. Then, we propose a large language model (LLM)-empowered critic-regularized decision transformer (DT) framework, termed LLM-CRDT, to learn effective UAV control policies. In LLM-CRDT, we incorporate critic networks to regularize the DT model training, thereby integrating the sequence modeling capabilities of DT with critic-based value guidance to enable learning effective policies from suboptimal datasets. Furthermore, to mitigate the data-hungry nature of transformer models, we employ a pre-trained LLM as the transformer backbone of the DT model and adopt a parameter-efficient fine-tuning strategy, i.e., LoRA, enabling rapid adaptation to UAV control tasks with small-scale datasets and low computational overhead. Extensive simulations demonstrate that LLM-CRDT outperforms benchmark online and offline RL methods, achieving up to 36.7% higher energy efficiency than the current state-of-the-art DT approaches.
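
The abstract above names LoRA as its parameter-efficient fine-tuning strategy. As a reminder of how a LoRA layer works, here is a minimal pure-Python sketch: the pre-trained weight stays frozen and only a low-rank update, scaled by `alpha / rank`, is trained. The class name, dimensions, and initialization scale are hypothetical, and nothing here reflects how LLM-CRDT itself is implemented.

```python
import random


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        self.weight = weight  # frozen pre-trained weight, d_out x d_in
        d_out, d_in = len(weight), len(weight[0])
        self.scale = alpha / rank
        # A starts with small random values and B with zeros, so the
        # adapted layer initially matches the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


# Usage: a 4 x 6 frozen weight adapted with a rank-2 update.
rng = random.Random(42)
frozen = [[rng.uniform(-1.0, 1.0) for _ in range(6)] for _ in range(4)]
layer = LoRALinear(frozen, rank=2, alpha=4.0)
x = [1.0, 0.5, -0.5, 2.0, 0.0, 1.5]
print("output:", layer.forward(x))
print("trainable:", layer.trainable_params(), "frozen:", 4 * 6)
```

For realistic layer sizes the number of trainable parameters, `rank * (d_in + d_out)`, is far smaller than `d_in * d_out`, which is what makes adaptation with small datasets and low overhead plausible.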

IT Sep 24, 2019
Power Allocation in Cache-Aided NOMA Systems: Optimization and Deep Reinforcement Learning Approaches

Khai Nguyen Doan, Mojtaba Vaezi, Wonjae Shin et al.

This work exploits the advantages of two prominent techniques in future communication networks, namely caching and non-orthogonal multiple access (NOMA). In particular, a system with Rayleigh fading channels and cache-enabled users is analyzed. It is shown that the caching-NOMA combination provides a new opportunity for cache hits, which enhances the cache utility as well as the effectiveness of NOMA. Importantly, this comes without requiring users' collaboration and thus avoids many complicated issues such as users' privacy and security, selfishness, etc. In order to optimize users' quality of service and, concurrently, ensure fairness among users, the probability that all users can decode the desired signals is maximized. In NOMA, a combination of multiple messages is sent to users, and the defined objective is approached by finding an appropriate power allocation for message signals. To address the power allocation problem, two novel methods are proposed. The first one is a divide-and-conquer-based method for which closed-form expressions for the optimal resource allocation policy are derived, making this method simple and flexible to the system context. The second one is based on the deep reinforcement learning method that allows all users to share the full bandwidth. Finally, simulation results are provided to demonstrate the effectiveness of the proposed methods and to compare their performance.