Jiangzhou Wang

LG
h-index: 45
32 papers
777 citations
Novelty: 46%
AI Score: 54

32 Papers

LG Jul 11, 2024
Distributed Deep Reinforcement Learning Based Gradient Quantization for Federated Learning Enabled Vehicle Edge Computing

Cui Zhang, Wenjun Zhang, Qiong Wu et al.

Federated Learning (FL) can protect the privacy of the vehicles in vehicle edge computing (VEC) to a certain extent through sharing the gradients of vehicles' local models instead of local data. The gradients of vehicles' local models are usually large for vehicular artificial intelligence (AI) applications, so transmitting such large gradients causes large per-round latency. Gradient quantization has been proposed as one effective approach to reduce the per-round latency in FL enabled VEC through compressing gradients and reducing the number of bits, i.e., the quantization level, to transmit gradients. The selection of quantization level and thresholds determines the quantization error, which further affects the model accuracy and training time. Thus, the total training time and quantization error (QE) become two key metrics for the FL enabled VEC. It is critical to jointly optimize the total training time and QE for the FL enabled VEC. However, the time-varying channel condition makes this problem more challenging to solve. In this paper, we propose a distributed deep reinforcement learning (DRL)-based quantization level allocation scheme to optimize the long-term reward in terms of the total training time and QE. Extensive simulations identify the optimal weighting factors between the total training time and QE, and demonstrate the feasibility and effectiveness of the proposed scheme.
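As a rough illustration of the trade-off described in this abstract (fewer bits per gradient entry versus a larger quantization error), the sketch below applies a generic uniform stochastic quantizer. It is not the paper's scheme: the function names `quantize` and `quantization_error`, the symmetric range, and the squared-error metric are all assumptions made for illustration.

```python
import random

def quantize(grad, bits, rng=random):
    """Stochastically round each entry of grad onto 2**bits uniform levels.

    Hypothetical helper: the levels span [-scale, +scale], where scale is
    the largest magnitude in grad (an assumption, not the paper's design).
    """
    levels = 2 ** bits - 1                 # number of intervals between levels
    scale = max(abs(g) for g in grad)
    if scale == 0:
        return list(grad)                  # nothing to quantize
    step = 2 * scale / levels
    out = []
    for g in grad:
        pos = (g + scale) / step           # position measured in steps
        low = min(int(pos), levels)        # index of the level at or below g
        frac = pos - low
        # Round up with probability equal to the fractional part.
        idx = min(low + (1 if rng.random() < frac else 0), levels)
        out.append(-scale + idx * step)
    return out

def quantization_error(grad, quantized):
    """Squared error between a gradient and its quantized version."""
    return sum((a - b) ** 2 for a, b in zip(grad, quantized))
```

Calling `quantize(grad, bits)` with a smaller `bits` shrinks the payload but widens the spacing between levels, which is the tension a quantization level allocation policy has to balance.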

LG Jul 9, 2024
Graph Neural Networks and Deep Reinforcement Learning Based Resource Allocation for V2X Communications

Maoxin Ji, Qiong Wu, Pingyi Fan et al.

In the rapidly evolving landscape of Internet of Vehicles (IoV) technology, Cellular Vehicle-to-Everything (C-V2X) communication has attracted much attention due to its superior performance in coverage, latency, and throughput. Resource allocation within C-V2X is crucial for ensuring the transmission of safety information and meeting the stringent requirements for ultra-low latency and high reliability in Vehicle-to-Vehicle (V2V) communication. This paper proposes a method that integrates Graph Neural Networks (GNN) with Deep Reinforcement Learning (DRL) to address this challenge. By constructing a dynamic graph with communication links as nodes and employing the Graph Sample and Aggregation (GraphSAGE) model to adapt to changes in graph structure, the model aims to ensure a high success rate for V2V communication while minimizing interference on Vehicle-to-Infrastructure (V2I) links, thereby ensuring the successful transmission of V2V link information and maintaining high transmission rates for V2I links. The proposed method retains the global feature learning capabilities of GNN and supports distributed network deployment, allowing vehicles to extract low-dimensional features that include structural information from the graph network based on local observations and to make independent resource allocation decisions. Simulation results indicate that the introduction of GNN, with a modest increase in computational load, effectively enhances the decision-making quality of agents, demonstrating superiority to other methods. This study not only provides a theoretically efficient resource allocation strategy for V2V and V2I communications but also paves a new technical path for resource management in practical IoV environments.
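To make the GraphSAGE step in this abstract more concrete, the following sketch shows a single mean-aggregation pass over a graph whose nodes stand for communication links. It is a simplified illustration only: the learned weight matrices and nonlinearity of a real GraphSAGE layer are omitted, and the name `sage_mean_layer` and the dictionary-based graph layout are assumptions.

```python
def sage_mean_layer(features, neighbors):
    """One simplified GraphSAGE-style pass: concat(self, mean(neighbors)).

    features:  dict mapping node -> list of floats (all the same length)
    neighbors: dict mapping node -> list of neighboring nodes
    Learned weights and activation are deliberately left out of this sketch.
    """
    out = {}
    for node, h in features.items():
        nbrs = neighbors.get(node, [])
        if nbrs:
            agg = [sum(features[n][k] for n in nbrs) / len(nbrs)
                   for k in range(len(h))]
        else:
            agg = [0.0] * len(h)           # isolated node: zero neighborhood
        out[node] = list(h) + agg
    return out
```

Because each node only needs its own features and those of its neighbors, a pass like this can be computed from local observations, which matches the distributed deployment the abstract describes.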

DC Aug 2, 2022
Mobility-Aware Cooperative Caching in Vehicular Edge Computing Based on Asynchronous Federated and Deep Reinforcement Learning

Qiong Wu, Yu Zhao, Qiang Fan et al.

Vehicular edge computing (VEC) can cache contents in different RSUs at the network edge to support real-time vehicular applications. In VEC, owing to the high-mobility characteristics of vehicles, it is necessary to cache the user data in advance and learn the most popular and interesting contents for vehicular users. Since user data usually contains privacy information, users are reluctant to share their data with others. To solve this problem, traditional federated learning (FL) needs to update the global model synchronously through aggregating all users' local models to protect users' privacy. However, vehicles may frequently drive out of the coverage area of the VEC before they complete their local model training and thus the local models cannot be uploaded as expected, which would reduce the accuracy of the global model. In addition, the caching capacity of the local RSU is limited and the popular contents are diverse, thus the size of the predicted popular contents usually exceeds the cache capacity of the local RSU. Hence, the VEC should cache the predicted popular contents in different RSUs while considering the content transmission delay. In this paper, we consider the mobility of vehicles and propose a cooperative Caching scheme in the VEC based on Asynchronous Federated and deep Reinforcement learning (CAFR). We first consider the mobility of vehicles and propose an asynchronous FL algorithm to obtain an accurate global model, and then propose an algorithm to predict the popular contents based on the global model. In addition, we consider the mobility of vehicles and propose a deep reinforcement learning algorithm to obtain the optimal cooperative caching location for the predicted popular contents in order to optimize the content transmission delay. Extensive experimental results have demonstrated that the CAFR scheme outperforms other baseline caching schemes.

LG Jul 1, 2024
Optimizing Age of Information in Vehicular Edge Computing with Federated Graph Neural Network Multi-Agent Reinforcement Learning

Wenhua Wang, Qiong Wu, Pingyi Fan et al.

With the rapid development of intelligent vehicles and Intelligent Transport Systems (ITS), sensors such as cameras and LiDAR installed on intelligent vehicles provide a higher capacity for executing computation-intensive and delay-sensitive tasks, thereby raising deployment costs. To address this issue, Vehicular Edge Computing (VEC) has been proposed to process data through Road Side Units (RSUs) to support real-time applications. This paper focuses on the Age of Information (AoI) as a key metric for data freshness and explores task offloading issues for vehicles under RSU communication resource constraints. We adopt a Multi-agent Deep Reinforcement Learning (MADRL) approach, allowing vehicles to autonomously make optimal data offloading decisions. However, MADRL poses risks of vehicle information leakage during communication learning and centralized training. To mitigate this, we employ a Federated Learning (FL) framework that shares model parameters instead of raw data to protect the privacy of vehicle users. Building on this, we propose an innovative distributed federated learning framework combining Graph Neural Networks (GNN), named Federated Graph Neural Network Multi-Agent Reinforcement Learning (FGNN-MADRL), to optimize AoI across the system. For the first time, road scenarios are constructed as graph data structures, and a GNN-based federated learning framework is proposed, effectively combining distributed and centralized federated aggregation. Furthermore, we propose a new MADRL algorithm that simplifies decision making and enhances offloading efficiency, further reducing the decision complexity. Simulation results demonstrate the superiority of our proposed approach over other methods.

81.9 SY Apr 30
Cooperative ISAC for LAE: Joint Trajectory Planning, Power allocation, and Dynamic Time Division

Fangzhi Li, Zhichu Ren, Cunhua Pan et al.

To enhance the performance of aerial-ground networks, this paper proposes an integrated sensing and communication (ISAC) framework for multi-UAV systems. In our model, ground base stations (BSs) cooperatively serve multiple unmanned aerial vehicles (UAVs), employing a dynamic time-division strategy where beam scanning for sensing precedes data communication in each time slot. To maximize the sum communication rate while satisfying a mission-level cumulative radar mutual information (MI) requirement, we jointly optimize the UAV trajectories, communication and sensing power allocation, and the time-division ratio. The resulting highly coupled non-convex optimization problem is efficiently solved using an alternating optimization (AO) and successive convex approximation (SCA) framework, which yields a non-decreasing objective sequence and convergence to a finite objective value under the adopted surrogate-based iterative procedure. Extensive simulation results demonstrate that our proposed joint design significantly outperforms benchmark schemes with static trajectories, partially optimized resources, or non-cooperative single-BS transmission. Furthermore, a comprehensive sensitivity analysis reveals the distinct mechanisms by which sensing thresholds and the number of UAVs influence resource allocation and spatial organization, highlighting the critical importance of dynamic, multi-dimensional resource management for effectively navigating the sensing-communication trade-off in low-altitude economies.

24.3 IT Apr 27
A Framework for Uplink ISAC Receiver Designs: Performance Analysis and Algorithm Development

Zhiyuan Yu, Hong Ren, Cunhua Pan et al.

Uplink integrated sensing and communication (ISAC) systems have recently emerged as a promising research direction, enabling simultaneous uplink signal detection and target sensing. In this paper, we propose the flexible projection (FP)-type receiver that unifies the projection-type receiver and the successive interference cancellation (SIC)-type receiver by using a flexible tradeoff factor to adapt to dynamically changing uplink ISAC scenarios. The FP-type receiver addresses the joint signal detection and target response estimation problem through two coordinated phases: 1) Communication signal detection using a reconstructed signal whose composition is controlled by the tradeoff factor, followed by 2) Target response estimation performed through subtraction of the detected communication signal from the received signal. With adjustable tradeoff factors, the FP-type receiver can balance the enhancement of the signal-to-interference-plus-noise ratio (SINR) with the reduction of correlation in the reconstructed signal for communication signal detection. The pairwise error probability (PEP) expressions are analyzed for both the maximum likelihood (ML) and the zero-forcing (ZF) detectors, revealing that the optimal tradeoff factor should be determined based on the adopted detection algorithm and the relative power of the sensing and communication (S&C) signals. A homotopy optimization framework is first applied for the FP-type receiver with a fixed tradeoff factor. This framework is then extended to develop the dynamic flexible projection (DFP)-type receiver, which iteratively adjusts the tradeoff factor for improved algorithm performance and environmental adaptability. Finally, we show that the length of the jointly processed signal should scale with the antenna size to fully unleash the potential of the uplink ISAC receiver.

LG Jul 10, 2024
Resource Allocation for Twin Maintenance and Computing Task Processing in Digital Twin Vehicular Edge Computing Network

Yu Xie, Qiong Wu, Pingyi Fan et al.

As a promising technology, vehicular edge computing (VEC) can provide computing and caching services by deploying VEC servers near vehicles. However, VEC networks still face challenges such as high vehicle mobility. Digital twin (DT), an emerging technology, can predict, estimate, and analyze real-time states by digitally modeling objects in the physical world. By integrating DT with VEC, a virtual vehicle DT can be created in the VEC server to monitor the real-time operating status of vehicles. However, maintaining the vehicle DT model requires ongoing attention from the VEC server, which also needs to offer computing services for the vehicles. Therefore, effective allocation and scheduling of VEC server resources are crucial. This study focuses on a general VEC network with a single VEC service and multiple vehicles, examining the two types of delays caused by twin maintenance and computational processing within the network. By transforming the problem using satisfaction functions, we propose an optimization problem aimed at maximizing each vehicle's resource utility to determine the optimal resource allocation strategy. Given the non-convex nature of the issue, we employ multi-agent Markov decision processes to reformulate the problem. Subsequently, we propose the twin maintenance and computing task processing resource collaborative scheduling (MADRL-CSTC) algorithm, which leverages multi-agent deep reinforcement learning. Experimental comparisons with alternative algorithms demonstrate that our proposed approach is effective in terms of resource allocation.

IT Mar 11, 2023
Deep Reinforcement Learning Based Power Allocation for Minimizing AoI and Energy Consumption in MIMO-NOMA IoT Systems

Hongbiao Zhu, Qiong Wu, Qiang Fan et al.

Multi-input multi-output and non-orthogonal multiple access (MIMO-NOMA) internet-of-things (IoT) systems can distinctly improve channel capacity and spectrum efficiency to support real-time applications. Age of information (AoI) is an important metric for real-time applications, but no existing work has minimized the AoI of the MIMO-NOMA IoT system, which motivates us to conduct this work. In the MIMO-NOMA IoT system, the base station (BS) determines the sample collection requirements and allocates the transmission power for each IoT device. Each device determines whether to sample data according to the sample collection requirements and adopts the allocated power to transmit the sampled data to the BS over the MIMO-NOMA channel. Afterwards, the BS employs the successive interference cancelation (SIC) technique to decode the signal of the data transmitted by each device. The sample collection requirements and power allocation affect the AoI and energy consumption of the system. It is critical to determine the optimal policy, including sample collection requirements and power allocation, to minimize the AoI and energy consumption of the MIMO-NOMA IoT system, where the transmission rate is not constant in the SIC process and the noise is stochastic in the MIMO-NOMA channel. In this paper, we propose the optimal power allocation to minimize the AoI and energy consumption of the MIMO-NOMA IoT system based on deep reinforcement learning (DRL). Extensive simulations are carried out to demonstrate the superiority of the optimal power allocation.
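For readers unfamiliar with the AoI metric used in this abstract, the sketch below traces how the age evolves over discrete time slots for a single device. It assumes a simplified slotted model that is not taken from the paper: the age grows by one each slot and resets to 1 whenever a fresh sample is delivered (that is, a delivered sample is assumed to be one slot old). The names `simulate_aoi` and `average_aoi` are hypothetical.

```python
def simulate_aoi(delivered, initial_age=1):
    """Return the age of information after each slot.

    delivered: iterable of booleans, True if a fresh sample reached the
    base station in that slot. Assumed model: age resets to 1 on delivery,
    otherwise it increases by one slot.
    """
    ages = []
    age = initial_age
    for ok in delivered:
        age = 1 if ok else age + 1
        ages.append(age)
    return ages

def average_aoi(ages):
    """Time-average age over the simulated horizon."""
    return sum(ages) / len(ages)
```

Under this toy model, sampling and transmitting more often lowers the average age but costs more energy, which is the trade-off a power allocation policy has to weigh.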

LG Aug 3, 2022
Asynchronous Federated Learning for Edge-assisted Vehicular Networks

Siyuan Wang, Qiong Wu, Qiang Fan et al.

Vehicular networks enable vehicles to support real-time vehicular applications through training data. Due to their limited computing capability, vehicles usually transmit data to a road side unit (RSU) at the network edge to process data. However, vehicles are usually reluctant to share data with each other due to privacy concerns. In traditional federated learning (FL), vehicles train on the data locally to obtain a local model and then upload the local model to the RSU to update the global model, so data privacy is protected by sharing model parameters instead of data. Traditional FL updates the global model synchronously, i.e., the RSU needs to wait for all vehicles to upload their models before updating the global model. However, vehicles may often drive out of the coverage of the RSU before they obtain their local models through training, which reduces the accuracy of the global model. It is necessary to propose an asynchronous federated learning (AFL) scheme to solve this problem, where the RSU updates the global model once it receives a local model from a vehicle. However, the amount of data, computing capability and vehicle mobility may affect the accuracy of the global model. In this paper, we jointly consider the amount of data, computing capability and vehicle mobility to design an AFL scheme to improve the accuracy of the global model. Extensive simulation experiments have demonstrated that our scheme outperforms the FL scheme.
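The asynchronous update described in this abstract, where the RSU refreshes the global model as soon as one local model arrives, can be sketched as a simple weighted blend. This is a generic illustration rather than the paper's rule: the function name `afl_update` and the fixed mixing weight `beta` are assumptions, and the paper's scheme additionally accounts for data amount, computing capability and mobility.

```python
def afl_update(global_model, local_model, beta=0.5):
    """Blend one newly received local model into the global model.

    beta is a hypothetical mixing weight in [0, 1]; a scheme like the one
    in the abstract could set it per vehicle instead of using a constant.
    """
    return [(1 - beta) * g + beta * l
            for g, l in zip(global_model, local_model)]

# Usage: the RSU applies updates in arrival order, never waiting for all vehicles.
w = [0.0, 0.0]
for local in ([2.0, 4.0], [0.0, 0.0]):
    w = afl_update(w, local, beta=0.5)
```

Unlike synchronous averaging, each call uses only the model that has just arrived, so a vehicle that leaves coverage early does not stall the round.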

97.9 IT Mar 10
Tensor Train Decomposition-based Channel Estimation for MIMO-AFDM Systems with Fractional Delay and Doppler

Ruizhe Wang, Cunhua Pan, Hong Ren et al.

Affine Frequency Division Multiplexing (AFDM) has emerged as a promising chirp-based multicarrier technology for high-speed communication systems. To fully exploit the diversity gain offered by AFDM, accurate channel estimation is essential. However, existing studies have mainly focused on the integer-delay-tap scenario and single-symbol pilot-based estimation. Since delay taps in practice are generally fractional, approximating them as integers not only degrades delay estimation accuracy but also severely affects Doppler frequency estimation. To address this problem, in this paper, we investigate channel estimation for multiple-input multiple-output (MIMO)-AFDM systems. A time-affine frequency (T-AF) domain pilot structure is proposed to exploit time-domain phase variations. By leveraging the rotational invariance property in the spatial and temporal domains, a channel estimation algorithm based on Vandermonde-structured tensor-train (TT) decomposition is developed. The proposed algorithm demonstrates superior computational efficiency compared with state-of-the-art parameter estimation methods. Moreover, diverging from current studies, we derive the global Ziv-Zakai bound (ZZB) as an alternative parameter estimation error lower bound to the Cramér-Rao bound (CRB). Numerical results show that the derived ZZB provides tighter global performance characterization and successfully captures the threshold phenomenon in mean square error (MSE) performance in the low-SNR regime. Furthermore, the proposed algorithm achieves superior communication performance relative to the existing schemes, while offering a computational speedup, reducing the execution time by an order of magnitude compared to the state-of-the-art iterative algorithms.

CV Dec 4, 2025
WiFi-based Cross-Domain Gesture Recognition Using Attention Mechanism

Ruijing Liu, Cunhua Pan, Jiaming Zeng et al.

While fulfilling communication tasks, wireless signals can also be used to sense the environment. Among various types of sensing media, WiFi signals offer advantages such as widespread availability, low hardware cost, and strong robustness to environmental conditions like light, temperature, and humidity. By analyzing WiFi signals in the environment, it is possible to capture dynamic changes of the human body and accomplish sensing applications such as gesture recognition. However, many existing gesture sensing solutions perform well in-domain but lack cross-domain capabilities (i.e., recognition performance in untrained environments). To address this, we extract Doppler spectra from the channel state information (CSI) received by all receivers and concatenate each Doppler spectrum along the same time axis to generate fused images with multi-angle information as input features. Furthermore, inspired by the convolutional block attention module (CBAM), we propose a gesture recognition network that integrates a multi-semantic spatial attention mechanism with a self-attention-based channel mechanism. This network constructs attention maps to quantify the spatiotemporal features of gestures in images, enabling the extraction of key domain-independent features. Additionally, ResNet18 is employed as the backbone network to further capture deep-level features. To validate the network performance, we evaluate the proposed network on the public Widar3 dataset, and the results show that it not only maintains high in-domain accuracy of 99.72%, but also achieves high performance in cross-domain recognition of 97.61%, significantly outperforming existing best solutions.

41.3 SP Apr 23
Robust Cross-Domain WiFi Fall Detection via Physics-Driven Attention-Enhanced Transformers

Yingzhe Wang, Cunhua Pan, Ruijing Liu et al.

Device-free fall detection utilizing WiFi Channel State Information (CSI) has emerged as a promising, privacy-preserving solution for elderly health monitoring in the Internet of Things (IoT) era. However, existing deep learning approaches suffer from severe performance degradation when deployed in unseen environments due to static background overfitting and Non-Line-of-Sight (NLoS) signal attenuation. To address these critical bottlenecks, we propose a robust, domain-generalizable framework featuring a novel Attention-Enhanced CNN-Transformer hybrid architecture. First, we design a physics-driven Dynamic Variance Gate (DVG) to dynamically calculate local temporal variance, acting as a soft-attention mask that eliminates static environmental DC components while amplifying dynamic human motion. Second, we introduce a Physics-Aware Data Augmentation strategy to force the network to learn invariant morphological signatures rather than environment-specific noise. Furthermore, a Convolutional Block Attention Module (CBAM) is integrated to refine spatiotemporal features prior to Transformer-based sequence modeling. Extensive cross-domain evaluations across four distinct indoor environments demonstrate that our method achieves 97.6% accuracy in NLoS scenarios and 98.8% in completely unseen environments without target-domain fine-tuning. Finally, we deploy the proposed framework on an edge computing system equipped with commercial WiFi NICs. Real-world live inference field tests confirm the system's robustness against unseen environmental layouts and its capability for continuous, low-latency whole-home safety monitoring.
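The idea behind a variance-based gate, as described in this abstract, can be illustrated with a sliding-window variance that is normalized into a soft mask. The sketch below is only a guess at the general shape of such a gate, not the paper's DVG: the function name `variance_gate`, the window length, and the normalization by the peak variance are all assumptions.

```python
from statistics import pvariance

def variance_gate(signal, window=5):
    """Return a soft mask in [0, 1] from local temporal variance.

    Static stretches (zero local variance) get a gate of 0, and the most
    dynamic stretch gets a gate of 1. Window size and peak normalization
    are illustrative choices, not taken from the paper.
    """
    n = len(signal)
    if n == 0:
        return []
    half = window // 2
    local_var = [pvariance(signal[max(0, t - half):min(n, t + half + 1)])
                 for t in range(n)]
    peak = max(local_var)
    if peak == 0:
        return [0.0] * n                   # fully static input
    return [v / peak for v in local_var]

# Usage: weight each sample by its gate value to suppress the static background.
csi = [1.0] * 10 + [1.0, 3.0, -1.0, 4.0, 0.0]
gated = [g * x for g, x in zip(variance_gate(csi), csi)]
```

In this toy trace the first samples are constant, so their gate is zero and they are suppressed, while the fluctuating tail keeps most of its amplitude.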

39.5 IT Mar 26
AMBER: An Adaptive Multimodal Mask Transformer for Beam Prediction with Missing Modalities

Chenyiming Wen, Binpu Shi, Min Li et al.

With the widespread adoption of millimeter-wave (mmWave) massive multi-input-multi-output (MIMO) in vehicular networks, accurate beam prediction and alignment have become critical for high-speed data transmission and reliable access. While traditional beam prediction approaches primarily rely on in-band beam training, recent advances have started to explore multimodal sensing to extract environmental semantics for enhanced prediction. However, the performance of existing multimodal fusion methods degrades significantly in real-world settings because they are vulnerable to missing data caused by sensor blockage, poor lighting, or GPS dropouts. To address this challenge, we propose AMBER (Adaptive multimodal Mask transformer for BEam pRediction), a novel end-to-end framework that processes temporal sequences of image, LiDAR, radar, and GPS data, while adaptively handling arbitrary missing-modality cases. AMBER introduces learnable modality tokens and a missing-modality-aware mask to prevent cross-modal noise propagation, along with a learnable fusion token and multihead attention to achieve robust modality-specific information distillation and feature-level fusion. Furthermore, a class-former-aided modality alignment (CMA) module and temporal-aware positional embedding are incorporated to preserve temporal coherence and ensure semantic alignment across modalities, facilitating the learning of modality-invariant and temporally consistent representations for beam prediction. Extensive experiments on the real-world DeepSense6G dataset demonstrate that AMBER significantly outperforms existing multimodal learning baselines. In particular, it maintains high beam prediction accuracy and robustness even under severe missing-modality scenarios, validating its effectiveness and practical applicability.

SP Jul 1, 2024
Channel Modeling Aided Dataset Generation for AI-Enabled CSI Feedback: Advances, Challenges, and Solutions

Yupeng Li, Gang Li, Zirui Wen et al.

The AI-enabled autoencoder has demonstrated great potential in channel state information (CSI) feedback in frequency division duplex (FDD) multiple input multiple output (MIMO) systems. However, this method completely changes the existing feedback strategies, making it impractical to deploy in recent years. To address this issue, this paper proposes a channel modeling aided data augmentation method based on a limited number of field channel data. Specifically, the user equipment (UE) extracts the primary stochastic parameters of the field channel data and transmits them to the base station (BS). The BS then updates the typical TR 38.901 model parameters with the extracted parameters. In this way, the updated channel model is used to generate the dataset. This strategy comprehensively considers the dataset collection, model generalization, model monitoring, and so on. Simulations verify that our proposed strategy can significantly improve performance compared to the benchmarks.

SP Aug 1, 2024
Augmenting Channel Simulator and Semi-Supervised Learning for Efficient Indoor Positioning

Yupeng Li, Xinyu Ning, Shijian Gao et al.

This work aims to tackle the labor-intensive and resource-consuming task of indoor positioning by proposing an efficient approach. The proposed approach involves the introduction of a semi-supervised learning (SSL) with a biased teacher (SSLB) algorithm, which effectively utilizes both labeled and unlabeled channel data. To reduce measurement expenses, unlabeled data is generated using an updated channel simulator (UCHS), and then weighted by adaptive confidence values to simplify the tuning of hyperparameters. Simulation results demonstrate that the proposed strategy achieves superior performance while minimizing measurement overhead and training expense compared to existing benchmarks, offering a valuable and practical solution for indoor positioning.

77.9 IT May 7
Near-field Channel Estimation for XL-RIS-aided mmWave MIMO Systems

Erkang Dong, Taihao Zhang, Cunhua Pan et al.

Extremely large-scale reconfigurable intelligent surfaces (XL-RISs) have emerged as a promising technology for millimeter-wave (mmWave) communications. However, the exceedingly large aperture of XL-RISs renders the RIS-user links likely to operate in the near-field region, where the conventional planar-wave assumption and angular-domain sparse representation become invalid, thus making channel estimation significantly more challenging. In this paper, we investigate cascaded channel estimation for an XL-RIS-aided multi-user multiple-input multiple-output (MU-MIMO) system, in which the BS-RIS channel is modeled in the far field, while the RIS-user channels exhibit near-field spherical-wave characteristics. To tackle the resulting hybrid-field estimation problem, we propose a low-overhead two-stage channel estimation scheme by jointly exploiting the common BS-RIS link shared by all users and the polar-domain sparsity of the RIS-user channels. Specifically, the multi-antenna users are firstly decomposed into multiple virtual single-antenna users, based on which the common BS-RIS parameters are extracted from a typical virtual user and the RIS-user channels are initialized via compensated polar-domain sparse recovery. Then, an alternating least-squares refinement procedure is developed to jointly improve the common BS-RIS operator and the user-specific RIS-side channels. Simulation results show that the proposed scheme achieves competitive channel estimation performance with substantially reduced pilot overhead compared with the existing near-field benchmarks.

LG Apr 12, 2024
Anti-Byzantine Attacks Enabled Vehicle Selection for Asynchronous Federated Learning in Vehicular Edge Computing

Cui Zhang, Xiao Xu, Qiong Wu et al.

In vehicle edge computing (VEC), asynchronous federated learning (AFL) is used, where the edge receives a local model and updates the global model, effectively reducing the global aggregation latency. Due to the different amounts of local data, computing capabilities and locations of the vehicles, renewing the global model with the same weight is inappropriate. The above factors affect the local calculation time and upload time of the local model, and a vehicle may also be affected by Byzantine attacks, leading to the deterioration of the vehicle data. However, based on deep reinforcement learning (DRL), we can consider these factors comprehensively to eliminate vehicles with poor performance as much as possible and exclude vehicles that have suffered Byzantine attacks before AFL. At the same time, when aggregating in AFL, we can focus on those vehicles with better performance to improve the accuracy and safety of the system. In this paper, we propose a vehicle selection scheme based on DRL in VEC. In this scheme, vehicle mobility, channel conditions with temporal variations, computational resources with temporal variations, different data amounts, the transmission channel status of vehicles, as well as Byzantine attacks are taken into account. Simulation results show that the proposed scheme effectively improves the safety and accuracy of the global model.

62.2 DC Apr 24
Network Edge Inference for Large Language Models: Principles, Techniques, and Opportunities

Zhixiong Chen, Bingjie Zhu, Jiangzhou Wang et al.

Large language models (LLMs) have advanced rapidly, emerging as versatile tools across fields thanks to their exceptional language understanding, generation, and reasoning capabilities. However, performing LLM inference at the network edge remains challenging due to their large memory and compute demands. This survey outlines the challenges specific to LLM edge inference and provides a comprehensive overview of recent progress, covering system architectures, model optimization and deployment, and resource management and scheduling. By synthesizing state-of-the-art techniques and mapping future directions, this survey aims to unlock the potential of LLMs in resource-constrained edge environments.

SP Mar 26, 2024
Multi-stream Transmission for Directional Modulation Network via Distributed Multi-UAV-aided Multi-active-IRS

Ke Yang, Rongen Dong, Wei Gao et al.

Active intelligent reflecting surface (IRS) is a revolutionary technique for future 6G networks. The conventional far-field single-IRS-aided directional modulation (DM) networks have only one (no direct path) or two (existing direct path) degrees of freedom (DoFs). This means that only one or two streams are transmitted simultaneously from the base station to the user, which seriously limits the rate gain achieved by the IRS. How can more than two DoFs be created for DM? In this paper, a single large-scale IRS is divided into multiple small IRSs and a novel multi-IRS-aided multi-stream DM network is proposed to achieve a point-to-point multi-stream transmission by creating $K$ ($\geq3$) DoFs, where multiple small IRSs are placed distributively via multiple unmanned aerial vehicles (UAVs). The null-space projection, zero-forcing (ZF) and phase alignment are adopted to design the transmit beamforming vector, receive beamforming vector and phase shift matrix (PSM), respectively, called NSP-ZF-PA. Here, $K$ PSMs and their corresponding beamforming vectors are independently optimized. The weighted minimum mean-square error (WMMSE) algorithm is involved in alternating iteration for the optimization variables by introducing the power constraint on IRS, named WMMSE-PC, where the majorization-minimization (MM) algorithm is used to solve the total PSM. To achieve a lower computational complexity, a maximum trace method, called Max-TR-SVD, is proposed by optimizing the PSM of all IRSs. Numerical simulation results show that the proposed NSP-ZF-PA performs much better than Max-TR-SVD in terms of rate. In particular, the rate of NSP-ZF-PA with sixteen small IRSs is about five times that of NSP-ZF-PA when all small IRSs are combined into a single large IRS. Thus, a dramatic rate enhancement may be achieved by multiple distributed IRSs.

ITNov 6, 2024
Large Generative Model-assisted Talking-face Semantic Communication System

Feibo Jiang, Siwei Tu, Li Dong et al.

The rapid development of generative Artificial Intelligence (AI) continually unveils the potential of Semantic Communication (SemCom). However, current talking-face SemCom systems still encounter challenges such as low bandwidth utilization, semantic ambiguity, and diminished Quality of Experience (QoE). This study introduces a Large Generative Model-assisted Talking-face Semantic Communication (LGM-TSC) system tailored for talking-face video communication. Firstly, we introduce a Generative Semantic Extractor (GSE) at the transmitter based on the FunASR model to convert semantically sparse talking-face videos into texts with high information density. Secondly, we establish a private Knowledge Base (KB) based on the Large Language Model (LLM) for semantic disambiguation and correction, complemented by a joint knowledge base-semantic-channel coding scheme. Finally, at the receiver, we propose a Generative Semantic Reconstructor (GSR) that utilizes BERT-VITS2 and SadTalker models to transform text back into a high-QoE talking-face video matching the user's timbre. Simulation results demonstrate the feasibility and effectiveness of the proposed LGM-TSC system.

NIMar 18, 2025
Multi-user Wireless Image Semantic Transmission over MIMO Multiple Access Channels

Bingyan Xie, Yongpeng Wu, Feng Shu et al.

This paper focuses on a typical uplink transmission scenario over a multiple-input multiple-output multiple access channel (MIMO-MAC) and proposes a multi-user learnable CSI fusion semantic communication (MU-LCFSC) framework. It incorporates CSI as side information into both the semantic encoders and decoders to generate a proper feature mask map and thereby produce a more robust attention weight distribution. At the decoding end in particular, a cooperative successive interference cancellation procedure is conducted along with a cooperative mask ratio generator, which flexibly controls the mask elements of the feature mask maps. Numerical results verify that the proposed MU-LCFSC outperforms DeepJSCC-NOMA by more than 3 dB in terms of PSNR.

SYSep 17, 2025
Large Language Model-Empowered Decision Transformer for UAV-Enabled Data Collection

Zhixion Chen, Jiangzhou Wang, Hyundong Shin et al.

The deployment of unmanned aerial vehicles (UAVs) for reliable and energy-efficient data collection from spatially distributed devices holds great promise in supporting diverse Internet of Things (IoT) applications. Nevertheless, the limited endurance and communication range of UAVs necessitate intelligent trajectory planning. While reinforcement learning (RL) has been extensively explored for UAV trajectory optimization, its interactive nature entails high costs and risks in real-world environments. Offline RL mitigates these issues but remains susceptible to unstable training and relies heavily on expert-quality datasets. To address these challenges, we formulate a joint UAV trajectory planning and resource allocation problem to maximize the energy efficiency of data collection. The resource allocation subproblem is first transformed into an equivalent linear programming formulation and solved optimally with polynomial-time complexity. Then, we propose a large language model (LLM)-empowered critic-regularized decision transformer (DT) framework, termed LLM-CRDT, to learn effective UAV control policies. In LLM-CRDT, we incorporate critic networks to regularize the DT model training, thereby integrating the sequence modeling capabilities of DT with critic-based value guidance to enable learning effective policies from suboptimal datasets. Furthermore, to mitigate the data-hungry nature of transformer models, we employ a pre-trained LLM as the transformer backbone of the DT model and adopt a parameter-efficient fine-tuning strategy, i.e., LoRA, enabling rapid adaptation to UAV control tasks with a small-scale dataset and low computational overhead. Extensive simulations demonstrate that LLM-CRDT outperforms benchmark online and offline RL methods, achieving up to 36.7% higher energy efficiency than the current state-of-the-art DT approaches.

SPJun 29, 2025
Multi-Branch DNN and CRLB-Ratio-Weight Fusion for Enhanced DOA Sensing via a Massive H$^2$AD MIMO Receiver

Feng Shu, Jiatong Bai, Di Wu et al.

As a green MIMO structure, massive H$^2$AD is viewed as a potential technology for the future 6G wireless network. For such a structure, it is challenging to design a low-complexity, high-performance fusion of the target direction values sensed by different sub-array groups while using less prior knowledge. To address this issue, a lightweight Cramer-Rao lower bound (CRLB)-ratio-weight fusion (WF) method is proposed, which approximates the inverse CRLB of each subarray using antenna number reciprocals to eliminate real-time CRLB computation. This reduces complexity and prior knowledge dependence while preserving fusion performance. Moreover, a multi-branch deep neural network (MBDNN) is constructed to further enhance direction-of-arrival (DOA) sensing by leveraging candidate angles from multiple subarrays. The subarray-specific branch networks are integrated with a shared regression module to effectively eliminate pseudo-solutions and fuse true angles. Simulation results show that the proposed CRLB-ratio-WF method achieves DOA sensing performance comparable to CRLB-based methods, while significantly reducing the reliance on prior knowledge. More notably, the proposed MBDNN has superior performance in low-SNR ranges. At SNR $= -15$ dB, it achieves an order-of-magnitude improvement in estimation accuracy compared to the CRLB-ratio-WF method.

MAJun 17, 2024
Reconfigurable Intelligent Surface Assisted VEC Based on Multi-Agent Reinforcement Learning

Kangwei Qi, Qiong Wu, Pingyi Fan et al.

Vehicular edge computing (VEC) is an emerging technology that enables vehicles to perform high-intensity tasks by executing them locally or offloading them to nearby edge devices. However, obstacles such as buildings may degrade communications and cause interruptions, so that a vehicle may not meet the requirements for task offloading. A reconfigurable intelligent surface (RIS) is introduced to support vehicle communication and provide an alternative communication path. The system performance can be improved by flexibly adjusting the phase-shift of the RIS. For an RIS-assisted VEC system where tasks arrive randomly, we design a control scheme that considers offloading power, local power allocation and phase-shift optimization. To solve this non-convex problem, we propose a new deep reinforcement learning (DRL) framework that employs a modified multi-agent deep deterministic policy gradient (MADDPG) approach to optimize the power allocation for vehicle users (VUs) and a block coordinate descent (BCD) algorithm to optimize the phase-shift of the RIS. Simulation results show that our proposed scheme outperforms the centralized deep deterministic policy gradient (DDPG) scheme and the random scheme.

LGJun 17, 2024
Deep-Reinforcement-Learning-Based AoI-Aware Resource Allocation for RIS-Aided IoV Networks

Kangwei Qi, Qiong Wu, Pingyi Fan et al.

Reconfigurable Intelligent Surface (RIS) is a pivotal technology in communication, offering an alternative path that significantly enhances the link quality in wireless communication environments. In this paper, we propose an RIS-assisted internet of vehicles (IoV) network, considering the vehicle-to-everything (V2X) communication method. In addition, in order to improve the timeliness of vehicle-to-infrastructure (V2I) links and the stability of vehicle-to-vehicle (V2V) links, we introduce the age of information (AoI) model and the payload transmission probability model. With the objective of minimizing the AoI of V2I links and prioritizing the transmission of V2V link payloads, we formulate the optimization problem as a Markov decision process (MDP) in which the BS serves as an agent that allocates resources and controls the phase-shift for the vehicles using the soft actor-critic (SAC) algorithm, which gradually converges and maintains high stability. An AoI-aware joint vehicular resource allocation and RIS phase-shift control scheme based on the SAC algorithm is proposed, and simulation results show that its convergence speed, cumulative reward, AoI performance, and payload transmission probability outperform those of proximal policy optimization (PPO), deep deterministic policy gradient (DDPG), twin delayed deep deterministic policy gradient (TD3) and stochastic algorithms.
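
The AoI bookkeeping that this kind of formulation relies on can be illustrated with a minimal sketch. It assumes the common discrete-time convention in which a link's age resets to one slot after a successful update and otherwise grows by one slot; the function names and the example outcomes are hypothetical, and the paper's own AoI model may differ.

```python
def step_aoi(age, delivered):
    """Return the next-slot AoI: reset to 1 on a successful update, else age + 1."""
    return 1 if delivered else age + 1


def average_aoi(outcomes, initial_age=1):
    """Average AoI over a non-empty sequence of per-slot delivery outcomes."""
    if not outcomes:
        raise ValueError("outcomes must contain at least one slot")
    age = initial_age
    total = 0
    for delivered in outcomes:
        age = step_aoi(age, delivered)
        total += age
    return total / len(outcomes)


# Hypothetical trace: True means the V2I update was delivered in that slot.
trace = [True, False, False, True, False]
print(average_aoi(trace))
```

An agent that minimizes AoI is rewarded for keeping this running average low, which is why failed or postponed V2I transmissions are penalized more the longer they persist.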

LGJun 11, 2024
Semantic-Aware Spectrum Sharing in Internet of Vehicles Based on Deep Reinforcement Learning

Zhiyu Shao, Qiong Wu, Pingyi Fan et al.

This work investigates semantic communication in high-speed mobile Internet of vehicles (IoV) environments, with a focus on spectrum sharing between vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communications. We specifically address spectrum scarcity and network traffic and then propose a semantic-aware spectrum sharing algorithm (SSS) based on the deep reinforcement learning (DRL) soft actor-critic (SAC) approach. Firstly, we delve into the extraction of semantic information. Secondly, we redefine metrics for semantic information in V2V and V2I spectrum sharing in IoV environments, introducing high-speed semantic spectrum efficiency (HSSE) and semantic transmission rate (HSR). Finally, we employ the SAC algorithm for decision optimization in V2V and V2I spectrum sharing based on semantic information. This optimization encompasses the optimal link of V2V and V2I sharing strategies, the transmission power for vehicles sending semantic information and the length of transmitted semantic symbols, aiming at maximizing the HSSE of V2I and enhancing the success rate of effective semantic information transmission (SRS) of V2V. Experimental results demonstrate that the SSS algorithm outperforms other baseline algorithms, including traditional-communication-based spectrum sharing algorithms and spectrum sharing algorithms using other reinforcement learning approaches. The SSS algorithm exhibits a 15% increase in HSSE and approximately a 7% increase in SRS.

LGMay 5, 2023
Over-the-Air Federated Averaging with Limited Power and Privacy Budgets

Na Yan, Kezhi Wang, Cunhua Pan et al.

To jointly overcome the communication bottleneck and privacy leakage of wireless federated learning (FL), this paper studies a differentially private over-the-air federated averaging (DP-OTA-FedAvg) system with a limited sum power budget. With DP-OTA-FedAvg, the gradients are aligned by an alignment coefficient and aggregated over the air, and channel noise is employed to protect privacy. We aim to improve the learning performance by jointly designing the device scheduling, alignment coefficient, and the number of aggregation rounds of federated averaging (FedAvg) subject to sum power and privacy constraints. We first present the privacy analysis based on differential privacy (DP) to quantify the impact of the alignment coefficient on privacy preservation in each communication round. Furthermore, to study how the device scheduling, alignment coefficient, and the number of global aggregation rounds affect the learning process, we conduct the convergence analysis of DP-OTA-FedAvg in the cases of convex and non-convex loss functions. Based on these analytical results, we formulate an optimization problem to minimize the optimality gap of DP-OTA-FedAvg subject to limited sum power and privacy budgets. The problem is solved by decoupling it into two sub-problems. Given the number of communication rounds, we derive the relationship between the number of scheduled devices and the alignment coefficient, which offers a set of potential optimal solution pairs of device scheduling and the alignment coefficient. Thanks to the reduced search space, the optimal solution can be efficiently obtained. The effectiveness of the proposed policy is validated through simulations.
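
The aligned over-the-air aggregation step can be sketched as follows. This is only an illustration under simplifying assumptions that are not taken from the paper: real-valued channel gains known perfectly at each device, a single scalar alignment coefficient `eta`, and Gaussian receiver noise. All names are hypothetical.

```python
import random


def ota_aggregate(gradients, channels, eta, noise_std, rng):
    """Simulate one over-the-air aggregation round in a real-valued toy model."""
    dim = len(gradients[0])
    received = [0.0] * dim
    for grad, h in zip(gradients, channels):
        # Each device pre-scales its gradient so that h * tx equals eta * grad.
        tx = [eta / h * g for g in grad]
        for i in range(dim):
            received[i] += h * tx[i]
    # Channel noise is added once to the superimposed signal.
    received = [r + rng.gauss(0.0, noise_std) for r in received]
    # The server rescales the noisy sum to estimate the average gradient.
    return [r / (eta * len(gradients)) for r in received]


rng = random.Random(0)
grads = [[1.0, 2.0], [3.0, 4.0]]
gains = [0.5, 2.0]
print(ota_aggregate(grads, gains, eta=0.1, noise_std=0.05, rng=rng))
```

In this toy model the effective noise on the estimate has standard deviation `noise_std / (eta * K)` for `K` devices, so a smaller `eta` masks the gradients more strongly at the cost of a noisier estimate, while a larger `eta` requires more transmit power. That is the tension the alignment coefficient has to balance.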

CRDec 29, 2021
Physical Layer Security Techniques for Future Wireless Networks

Weiping Shi, Xinyi Jiang, Jinsong Hu et al.

The broadcast nature of wireless communication systems makes wireless transmission extremely susceptible to eavesdropping and even malicious interference. Physical layer security technology can effectively protect the private information sent by the transmitter from being intercepted by illegal eavesdroppers, thus ensuring the privacy and security of communication between the transmitter and legitimate users. The development of mobile communication presents new challenges to physical layer security research. This paper provides a comprehensive survey of physical layer security research on various promising mobile technologies, including directional modulation (DM), spatial modulation (SM), covert communication, intelligent reflecting surface (IRS)-aided communication, and so on. Finally, future trends and unresolved technical challenges in physical layer security for mobile communications are summarized.

MENov 1, 2020
Fast Network Community Detection with Profile-Pseudo Likelihood Methods

Jiangzhou Wang, Jingfei Zhang, Binghui Liu et al.

The stochastic block model is one of the most studied network models for community detection. It is well-known that most algorithms proposed for fitting the stochastic block model likelihood function cannot scale to large-scale networks. One prominent work that overcomes this computational challenge is Amini et al. (2013), which proposed a fast pseudo-likelihood approach for fitting stochastic block models to large sparse networks. However, this approach does not have a convergence guarantee, and is not well suited for small- or medium-scale networks. In this article, we propose a novel likelihood-based approach that decouples row and column labels in the likelihood function, which enables a fast alternating maximization; the new method is computationally efficient, performs well for both small- and large-scale networks, and has a provable convergence guarantee. We show that our method provides strongly consistent estimates of the communities in a stochastic block model. As demonstrated in simulation studies, the proposed method outperforms the pseudo-likelihood approach in terms of both estimation accuracy and computation efficiency, especially for large sparse networks. We further consider extensions of our proposed method to handle networks with degree heterogeneity and bipartite properties.

ITFeb 4, 2018
Power Allocation Strategy of Maximizing Secrecy Rate for Secure Directional Modulation Networks

Simin Wan, Feng Shu, Jinhui Lu et al.

In this paper, given the beamforming vector of confidential messages, the artificial noise (AN) projection matrix and a total power constraint, a power allocation (PA) strategy of maximizing secrecy rate (Max-SR) is proposed for secure directional modulation (DM) networks. By the method of Lagrange multipliers, the analytic expression of the proposed PA strategy is derived. To confirm the benefit of the Max-SR-based PA strategy, we take the null-space projection (NSP) beamforming scheme as an example and derive the closed-form expression of its optimal PA strategy. From simulation results, we find the following facts: in the medium and high signal-to-noise-ratio (SNR) regions, compared with three typical PA parameters such as $\beta = 0.1$, $0.5$, and $0.9$, the optimal PA shows a substantial SR performance gain, with a maximum gain of more than $60\%$. Additionally, as the PA factor increases from 0 to 1, the achievable SR increases accordingly in the low SNR region whereas it first increases and then decreases in the medium and high SNR regions, where the SR can be approximately viewed as a convex function of the PA factor. Finally, as the number of antennas increases, the optimal PA factor becomes large and tends to one in the medium and high SNR regions. In other words, the contribution of AN to SR can be trivial in such a situation.
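
The trade-off between confidential-message power and AN power can be explored numerically with a toy scalar model. The sketch below is not the paper's closed-form solution: it assumes unit noise power, AN that is fully nulled at the desired receiver, and hypothetical channel gains `bob_gain`, `eve_gain` and `eve_an_gain`. It also replaces the Lagrange-multiplier derivation with a plain grid search over the PA factor.

```python
import math


def secrecy_rate(beta, power, bob_gain, eve_gain, eve_an_gain):
    """Toy secrecy rate in bits/s/Hz for PA factor beta, with unit noise power."""
    # The desired receiver sees no AN in this toy model (assumed nulled there).
    bob_sinr = beta * power * bob_gain
    # The eavesdropper sees the leaked message plus AN acting as interference.
    eve_sinr = beta * power * eve_gain / (1.0 + (1.0 - beta) * power * eve_an_gain)
    return max(0.0, math.log2(1.0 + bob_sinr) - math.log2(1.0 + eve_sinr))


def best_beta(power, bob_gain, eve_gain, eve_an_gain, steps=100):
    """Grid-search the PA factor in [0, 1] that maximizes the toy secrecy rate."""
    grid = [i / steps for i in range(steps + 1)]
    return max(
        grid,
        key=lambda b: secrecy_rate(b, power, bob_gain, eve_gain, eve_an_gain),
    )


# Hypothetical operating point: total power 10, comparable channel gains.
beta_star = best_beta(10.0, 1.0, 0.8, 1.0)
print(beta_star, secrecy_rate(beta_star, 10.0, 1.0, 0.8, 1.0))
```

Comparing the value at `beta_star` with the values at fixed factors such as 0.1, 0.5 and 0.9 gives a rough feel for the kind of gain an optimized PA factor can provide, although the numbers from this toy model are not comparable to the paper's results.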

ITJan 15, 2018
Two High-performance Schemes of Transmit Antenna Selection for Secure Spatial Modulation

Feng Shu, Zhengwang Wang, Riqing Chen et al.

In this paper, an artificial noise (AN)-aided secure spatial modulation (SM) system is investigated. To achieve a higher secrecy rate (SR) in such a system, two high-performance schemes of transmit antenna selection (TAS), leakage-based and maximum secrecy rate (Max-SR), are proposed and a generalized Euclidean distance-optimized antenna selection (EDAS) method is designed. From simulation results and analysis, the four TAS schemes rank in decreasing order of SR performance as follows: Max-SR, leakage-based, generalized EDAS, and random (conventional). However, the proposed Max-SR method requires an exhaustive search to achieve the optimal SR performance, so its complexity becomes extremely high as the number of antennas grows to medium and large scale. The proposed leakage-based method approaches the Max-SR method with much lower complexity. Thus, it achieves a good balance between complexity and SR performance. In terms of bit error rate (BER), their performances are in an increasing order: random, leakage-based, Max-SR, and generalized EDAS.

LGDec 16, 2017
A Machine Learning Framework for Resource Allocation Assisted by Cloud Computing

Jun-Bo Wang, Junyuan Wang, Yongpeng Wu et al.

Conventionally, resource allocation is formulated as an optimization problem and solved online with instantaneous scenario information. Since most resource allocation problems are not convex, the optimal solutions are very difficult to obtain in real time. Lagrangian relaxation or greedy methods are then often employed, which result in performance loss. Therefore, the conventional methods of resource allocation face great challenges in meeting the ever-increasing QoS requirements of users with scarce radio resources. Assisted by cloud computing, a huge amount of historical data on scenarios can be collected for extracting similarities among scenarios using machine learning. Moreover, optimal or near-optimal solutions of historical scenarios can be searched offline and stored in advance. When the measured data of the current scenario arrive, the current scenario is compared with historical scenarios to find the most similar one. Then, the optimal or near-optimal solution of the most similar historical scenario is adopted to allocate the radio resources for the current scenario. To facilitate the application of this new design philosophy, a machine learning framework is proposed for resource allocation assisted by cloud computing. An example of beam allocation in multi-user massive multiple-input-multiple-output (MIMO) systems shows that the proposed machine-learning based resource allocation outperforms conventional methods.
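
The lookup step described above, finding the most similar historical scenario and reusing its stored solution, can be sketched as a nearest-neighbour search. This is a minimal illustration, assuming scenarios are represented as numeric feature vectors compared by Euclidean distance. The feature values and solution labels are hypothetical, and the paper's actual similarity measure may differ.

```python
import math


def nearest_scenario(current, history):
    """Return the stored solution of the historical scenario closest to `current`.

    `history` is a list of (feature_vector, solution) pairs computed offline.
    """
    _features, solution = min(
        history, key=lambda item: math.dist(item[0], current)
    )
    return solution


# Hypothetical offline database: scenario features mapped to a beam allocation.
history = [
    ((0.1, 0.9), "beam_A"),
    ((0.8, 0.2), "beam_B"),
    ((0.5, 0.5), "beam_C"),
]
print(nearest_scenario((0.7, 0.3), history))
```

The online cost is a single pass over the stored scenarios, which is what lets the expensive search for optimal or near-optimal solutions move offline.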