Xiaohu You

LG
h-index: 116
28 papers
854 citations
Novelty: 46%
AI Score: 53

28 Papers

AI · Jul 7, 2023
Large AI Model-Based Semantic Communications

Feibo Jiang, Yubo Peng, Li Dong et al.

Semantic communication (SC) is an emerging intelligent paradigm, offering solutions for various future applications such as the metaverse, mixed reality, and the Internet of Everything. However, in current SC systems, the construction of the knowledge base (KB) faces several issues, including limited knowledge representation, frequent knowledge updates, and insecure knowledge sharing. Fortunately, the development of the large AI model (LAM) provides new solutions to overcome these issues. Here, we propose a LAM-based SC framework (LAM-SC) specifically designed for image data, where we first apply the segment anything model (SAM)-based KB (SKB), which can split the original image into different semantic segments using universal semantic knowledge. Then, we present an attention-based semantic integration (ASI) to weigh the semantic segments generated by the SKB without human participation and integrate them into a semantic-aware image. Additionally, we propose an adaptive semantic compression (ASC) encoding to remove redundant information in semantic features, thereby reducing communication overhead. Finally, through simulations, we demonstrate the effectiveness of the LAM-SC framework and the possibility of applying the LAM-based KB in future SC paradigms.

LG · Jun 13, 2022
Computation Offloading and Resource Allocation in F-RANs: A Federated Deep Reinforcement Learning Approach

Lingling Zhang, Yanxiang Jiang, Fu-Chun Zheng et al.

The fog radio access network (F-RAN) is a promising technology in which user mobile devices (MDs) can offload computation tasks to nearby fog access points (F-APs). Due to the limited resources of F-APs, it is important to design an efficient task offloading scheme. In this paper, by considering a time-varying network environment, a dynamic computation offloading and resource allocation problem in F-RANs is formulated to minimize the task execution delay and energy consumption of MDs. To solve the problem, a federated deep reinforcement learning (DRL) based algorithm is proposed, where the deep deterministic policy gradient (DDPG) algorithm performs computation offloading and resource allocation in each F-AP. Federated learning is exploited to train the DDPG agents in order to decrease the computational complexity of the training process and protect user privacy. Simulation results show that the proposed federated DDPG algorithm achieves lower task execution delay and energy consumption of MDs, and does so more quickly, than other existing strategies.
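
As a rough illustration of the federated training step mentioned in this abstract, the sketch below averages the parameter vectors of several local agents, weighted by how many samples each one trained on. This is a generic FedAvg-style aggregation and only an assumption about how the F-AP models might be combined; the abstract does not state the aggregation rule, and the function name `federated_average` and the toy numbers are hypothetical.

```python
from typing import List


def federated_average(local_weights: List[List[float]],
                      sample_counts: List[int]) -> List[float]:
    """Sample-weighted average of per-client parameter vectors (FedAvg-style)."""
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per client")
    total = sum(sample_counts)
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, count in zip(local_weights, sample_counts):
        for i, w in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            global_weights[i] += w * count / total
    return global_weights


# Hypothetical example: three F-APs, each holding a 2-parameter local model.
fap_weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fap_samples = [10, 10, 20]
global_model = federated_average(fap_weights, fap_samples)
```

Only model parameters are exchanged in this scheme, never the raw task data, which is the property the abstract relies on for privacy.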

NI · Nov 28, 2023
Digital Twin-Enhanced Deep Reinforcement Learning for Resource Management in Networks Slicing

Zhengming Zhang, Yongming Huang, Cheng Zhang et al.

Network slicing-based communication systems can dynamically and efficiently allocate resources for diversified services. However, due to the limitation of the network interface on channel access and the complexity of the resource allocation, it is challenging to achieve an acceptable solution in a practical system without precise prior knowledge of the probability model governing the dynamics of the service requests. Existing work attempts to solve this problem using deep reinforcement learning (DRL); however, such methods usually require a large amount of interaction with the real environment to achieve good results. In this paper, a framework consisting of a digital twin and reinforcement learning agents is presented to handle the issue. Specifically, we propose to use historical data and neural networks to build a digital twin model that simulates the state variation law of the real environment. Then, we use the data generated by the network slicing environment to calibrate the digital twin so that it stays in sync with the real environment. Finally, the DRL agent for slice optimization improves its own performance in this virtual pre-verification environment. We conducted an exhaustive verification of the proposed digital twin framework to confirm its scalability. Specifically, we propose to use loss landscapes to visualize the generalization of DRL solutions. We explore a distillation-based optimization scheme for lightweight slicing strategies. In addition, we extend the framework to offline reinforcement learning, where solutions can be used to obtain intelligent decisions based solely on historical data. Numerical simulation experiments show that the proposed digital twin can significantly improve the performance of the slice optimization strategy.

LG · Jun 23, 2022
Content Popularity Prediction Based on Quantized Federated Bayesian Learning in Fog Radio Access Networks

Yunwei Tao, Yanxiang Jiang, Fu-Chun Zheng et al.

In this paper, we investigate the content popularity prediction problem in cache-enabled fog radio access networks (F-RANs). In order to predict the content popularity with high accuracy and low complexity, we propose a Gaussian process based regressor to model the content request pattern. First, the relationship between content features and popularity is captured by our proposed model. Then, we utilize Bayesian learning, which is robust to overfitting, to train the model parameters. However, Bayesian methods are usually unable to find a closed-form expression of the posterior distribution. To tackle this issue, we apply a stochastic variance reduced gradient Hamiltonian Monte Carlo (SVRG-HMC) method to approximate the posterior distribution. To utilize the computing resources of other fog access points (F-APs) and to reduce the communications overhead, we propose a quantized federated learning (FL) framework combined with Bayesian learning. The quantized federated Bayesian learning framework allows each F-AP to send gradients to the cloud server after quantizing and encoding them. It can effectively achieve a tradeoff between prediction accuracy and communications overhead. Simulation results show that our proposed policy outperforms the existing policies.
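
The quantize-then-send step described in this abstract can be pictured with a small sketch. The code below uses unbiased stochastic uniform quantization, which is an assumed scheme chosen for illustration; the abstract does not say which quantizer or encoder the authors use, and the names `quantize_gradient` and `dequantize_gradient` are hypothetical.

```python
import random
from typing import List, Tuple


def quantize_gradient(grad: List[float], num_levels: int,
                      rng: random.Random) -> Tuple[List[int], float]:
    """Map each entry to one of num_levels integer codes on [-scale, scale]."""
    scale = max(abs(g) for g in grad)
    if scale == 0.0:
        return [0] * len(grad), 0.0
    step = 2.0 * scale / (num_levels - 1)
    codes = []
    for g in grad:
        pos = (g + scale) / step          # position on the grid, in [0, num_levels - 1]
        low = int(pos)                    # lower neighbouring level
        # Round up with probability equal to the fractional part (unbiased).
        code = low + (1 if rng.random() < pos - low else 0)
        codes.append(min(code, num_levels - 1))
    return codes, scale


def dequantize_gradient(codes: List[int], scale: float,
                        num_levels: int) -> List[float]:
    """Server-side reconstruction from integer codes and the shared scale."""
    if scale == 0.0:
        return [0.0] * len(codes)
    step = 2.0 * scale / (num_levels - 1)
    return [c * step - scale for c in codes]


# Hypothetical local gradient at one F-AP, sent with 16 levels (4 bits per entry).
rng = random.Random(0)
grad = [0.5, -1.0, 0.25, 1.0]
codes, scale = quantize_gradient(grad, 16, rng)
recovered = dequantize_gradient(codes, scale, 16)
```

Fewer levels mean fewer bits per gradient entry but a larger reconstruction error, which is one way to read the accuracy versus overhead tradeoff the abstract mentions.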

LG · Jun 13, 2022
Content Popularity Prediction in Fog-RANs: A Clustered Federated Learning Based Approach

Zhiheng Wang, Yanxiang Jiang, Fu-Chun Zheng et al.

In this paper, the content popularity prediction problem in fog radio access networks (F-RANs) is investigated. Based on clustered federated learning, we propose a novel mobility-aware popularity prediction policy, which integrates content popularities in terms of local users and mobile users. For local users, the content popularity is predicted by learning the hidden representations of local users and contents. Initial features of local users and contents are generated by incorporating neighbor information with self information. Then, a dual-channel neural network (DCNN) model is introduced to learn the hidden representations by producing deep latent features from the initial features. For mobile users, the content popularity is predicted via user preference learning. In order to distinguish regional variations of content popularity, clustered federated learning (CFL) is employed, which enables fog access points (F-APs) with similar regional types to benefit from one another and provides a more specialized DCNN model for each F-AP. Simulation results show that our proposed policy achieves significant performance improvement over the traditional policies.

SP · Feb 15, 2018
Residual-Based Detections and Unified Architecture for Massive MIMO Uplink

Chuan Zhang, Yufeng Yang, Shunqing Zhang et al.

The massive multiple-input multiple-output (M-MIMO) technique brings better energy efficiency and coverage but higher computational complexity than small-scale MIMO. For linear detection methods such as minimum mean square error (MMSE), the prohibitive complexity lies in solving large-scale linear equations. For a better trade-off between bit-error-rate (BER) performance and computational complexity, iterative linear algorithms like conjugate gradient (CG) have been applied and have shown their feasibility in recent years. In this paper, residual-based detection (RBD) algorithms are proposed for M-MIMO detection, including the minimal residual (MINRES), generalized minimal residual (GMRES), and conjugate residual (CR) algorithms. RBD algorithms focus on minimizing the residual norm per iteration, whereas most existing algorithms focus on approximating the exact signal. Numerical results have shown that, for $64$-QAM $128\times 8$ MIMO, RBD algorithms are only $0.13$ dB away from the exact matrix inversion method when BER$=10^{-4}$. The stability of RBD algorithms has also been verified under various correlation conditions. A complexity comparison has shown that the CR algorithm requires $87\%$ less complexity than the traditional method for $128\times 60$ MIMO. A flexible unified hardware architecture is proposed, which guarantees a low-complexity implementation for a family of RBD M-MIMO detectors.
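
To make the idea of residual-based detection concrete, the sketch below solves the MMSE system $(H^H H + \sigma^2 I)\,x = H^H y$ with the textbook conjugate residual iteration instead of an explicit matrix inverse. It is a plain-Python illustration under stated assumptions, not the paper's detector or hardware architecture: the helper names (`mmse_system`, `conjugate_residual`), the zero initial guess, the stopping tolerance, and the tiny $4\times 2$ channel are all hypothetical.

```python
from typing import List, Tuple

Vector = List[complex]
Matrix = List[List[complex]]


def matvec(A: Matrix, x: Vector) -> Vector:
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def inner(u: Vector, v: Vector) -> complex:
    """Hermitian inner product u^H v."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def mmse_system(H: Matrix, y: Vector, noise_var: float) -> Tuple[Matrix, Vector]:
    """Build A = H^H H + noise_var * I and b = H^H y."""
    n_rx, n_tx = len(H), len(H[0])
    A = [[sum(H[k][i].conjugate() * H[k][j] for k in range(n_rx))
          + (noise_var if i == j else 0.0)
          for j in range(n_tx)] for i in range(n_tx)]
    b = [sum(H[k][i].conjugate() * y[k] for k in range(n_rx)) for i in range(n_tx)]
    return A, b


def conjugate_residual(A: Matrix, b: Vector, num_iters: int,
                       tol: float = 1e-12) -> Vector:
    """Conjugate residual iteration for Hermitian positive definite A."""
    x = [0j] * len(b)
    r = list(b)                      # residual b - A x for the zero initial guess
    p = list(r)
    Ar = matvec(A, r)
    Ap = list(Ar)
    rAr = inner(r, Ar).real
    for _ in range(num_iters):
        if inner(r, r).real ** 0.5 < tol:
            break                    # residual already negligible
        alpha = rAr / inner(Ap, Ap).real
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        Ar = matvec(A, r)
        rAr_new = inner(r, Ar).real
        beta = rAr_new / rAr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        Ap = [ari + beta * api for ari, api in zip(Ar, Ap)]
        rAr = rAr_new
    return x


# Hypothetical 4x2 channel and noise-free received vector for two symbols.
H = [[1 + 0j, 0.5j], [0.2 + 0j, 1 + 0j], [0.3j, -0.4 + 0j], [1 + 0j, 1 + 0j]]
x_true = [1 + 1j, -1 + 1j]
y = matvec(H, x_true)
A, b = mmse_system(H, y, noise_var=0.1)
x_hat = conjugate_residual(A, b, num_iters=5)
```

Each iteration costs one matrix-vector product with the small Gram matrix, so stopping after a few iterations trades a little accuracy for far less work than a full inversion.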

NI · Nov 29, 2023
Wireless Network Digital Twin for 6G: Generative AI as A Key Enabler

Zhenyu Tao, Wei Xu, Yongming Huang et al.

Digital twin, which enables emulation, evaluation, and optimization of physical entities through synchronized digital replicas, has gained increasing attention as a promising technology for intricate wireless networks. For 6G, numerous innovative wireless technologies and network architectures have posed new challenges in establishing wireless network digital twins. To tackle these challenges, artificial intelligence (AI), particularly the flourishing generative AI, emerges as a potential solution. In this article, we discuss emerging prerequisites for wireless network digital twins considering the complicated network architecture, tremendous network scale, extensive coverage, and diversified application scenarios in the 6G era. We further explore the applications of generative AI, such as Transformer and diffusion model, to empower the 6G digital twin from multiple perspectives including physical-digital modeling, synchronization, and slicing capability. Subsequently, we propose a hierarchical generative AI-enabled wireless network digital twin at both the message-level and policy-level, and provide a typical use case with numerical results to validate the effectiveness and efficiency. Finally, open research issues for wireless network digital twins in the 6G era are discussed.

LG · Oct 7, 2023
Digital Twin Assisted Deep Reinforcement Learning for Online Admission Control in Sliced Network

Zhenyu Tao, Wei Xu, Xiaohu You

The proliferation of diverse wireless services in 5G and beyond has led to the emergence of network slicing technologies. Among these, admission control plays a crucial role in achieving service-oriented optimization goals through the selective acceptance of service requests. Although deep reinforcement learning (DRL) forms the foundation in many admission control approaches thanks to its effectiveness and flexibility, initial instability with excessive convergence delay of DRL models hinders their deployment in real-world networks. We propose a digital twin (DT) accelerated DRL solution to address this issue. Specifically, we first formulate the admission decision-making process as a semi-Markov decision process, which is subsequently simplified into an equivalent discrete-time Markov decision process to facilitate the implementation of DRL methods. A neural network-based DT is established with a customized output layer for queuing systems, trained through supervised learning, and then employed to assist the training phase of the DRL model. Extensive simulations show that the DT-accelerated DRL improves resource utilization by over 40% compared to the directly trained state-of-the-art dueling deep Q-learning model. This improvement is achieved while preserving the model's capability to optimize the long-term rewards of the admission process.

IT · Apr 30, 2023
Self-information Domain-based Neural CSI Compression with Feature Coupling

Ziqing Yin, Renjie Xie, Wei Xu et al.

Deep learning (DL)-based channel state information (CSI) feedback methods compress the CSI matrix by directly exploiting its delay and angle features, while the amount of information contained in the CSI matrix has rarely been considered. Based on this observation, we introduce self-information as an informative CSI representation from the perspective of information theory, which explicitly reflects the amount of information in the original CSI matrix. Then, a novel DL-based network, namely SD-CsiNet, is proposed for temporal CSI compression in the self-information domain. The proposed SD-CsiNet projects the raw CSI onto a self-information matrix in the newly defined self-information domain, extracts both temporal and spatial features of the self-information matrix, and then couples these two features for effective compression. Experimental results verify the effectiveness of the proposed SD-CsiNet in exploiting the self-information of CSI. In particular, for compression ratios of 1/8 and 1/16, SD-CsiNet achieves performance gains of 7.17 dB and 3.68 dB, respectively, compared to state-of-the-art methods.

96.2 · IT · Apr 19
Node-Based Soft-Output Fast Successive Cancellation List Decoding of Polar Codes

Li Shen, Yongpeng Wu, Zhen Gao et al.

The soft-output successive cancellation list (SO-SCL) decoder provides a methodology for estimating the a-posteriori probability log-likelihood ratios by only leveraging the conventional SCL decoder of polar codes. However, the sequential decoding nature of SCL introduces high decoding latency to SO-SCL. In this paper, we incorporate node-based fast decoding into the SO-SCL framework. After addressing the challenge of soft output extraction in special node decoding, we propose the soft-output fast SCL (SO-FSCL) decoding algorithm, along with its log-domain implementation and a hardware-friendly version. The proposed SO-FSCL decoder can be regarded as an add-on extension to the FSCL decoder, making it possible to choose whether to output only hard decisions like FSCL or to provide additional soft outputs. Latency and complexity analyses demonstrate that SO-FSCL can significantly reduce, for example, decoding time steps by 81.8% (with unlimited resources), the number of additions by 41.3%, and the number of comparisons by 46.4%. Meanwhile, simulation results indicate that SO-FSCL delivers almost the same soft-output performance as SO-SCL, outperforming other soft-output polar decoders, especially in scenarios involving iterative decoding.

AI · Jul 29, 2024
Map2Traj: Street Map Piloted Zero-shot Trajectory Generation with Diffusion Model

Zhenyu Tao, Wei Xu, Xiaohu You

User mobility modeling plays a crucial role in the analysis and optimization of contemporary wireless networks. Typical stochastic mobility models, e.g., the random waypoint model and the Gauss-Markov model, can hardly capture the distribution characteristics of users within real-world areas. State-of-the-art trace-based mobility models and existing learning-based trajectory generation methods, however, are frequently constrained by the inaccessibility of substantial real trajectories due to privacy concerns. In this paper, we harness the intrinsic correlation between street maps and trajectories and develop a novel zero-shot trajectory generation method, named Map2Traj, by exploiting the diffusion model. We incorporate street maps as a condition to consistently pilot the denoising process and train our model on diverse sets of real trajectories from various regions in Xi'an, China, and their corresponding street maps. With solely the street map of an unobserved area, Map2Traj generates synthetic trajectories that not only closely resemble the real-world mobility pattern but also offer comparable efficacy. Extensive experiments validate the efficacy of our proposed method on zero-shot trajectory generation tasks in terms of both trajectory and distribution similarities. In addition, a case study of employing Map2Traj in wireless network optimization is presented to validate its efficacy for downstream applications.

LG · Dec 19, 2025
A Theoretical Analysis of State Similarity Between Markov Decision Processes

Zhenyu Tao, Wei Xu, Xiaohu You

The bisimulation metric (BSM) is a powerful tool for analyzing state similarities within a Markov decision process (MDP), revealing that states closer in BSM have more similar optimal value functions. While BSM has been successfully utilized in reinforcement learning (RL) for tasks like state representation learning and policy exploration, its application to state similarity between multiple MDPs remains challenging. Prior work has attempted to extend BSM to pairs of MDPs, but a lack of well-established mathematical properties has limited further theoretical analysis between MDPs. In this work, we formally establish a generalized bisimulation metric (GBSM) for measuring state similarity between arbitrary pairs of MDPs, which is rigorously proven with three fundamental metric properties, i.e., GBSM symmetry, inter-MDP triangle inequality, and a distance bound on identical spaces. Leveraging these properties, we theoretically analyze policy transfer, state aggregation, and sampling-based estimation across MDPs, obtaining explicit bounds that are strictly tighter than existing ones derived from the standard BSM. Additionally, GBSM provides a closed-form sample complexity for estimation, improving upon existing asymptotic results based on BSM. Numerical results validate our theoretical findings and demonstrate the effectiveness of GBSM in multi-MDP scenarios.

SP · Dec 4, 2025
Towards 6G Native-AI Edge Networks: A Semantic-Aware and Agentic Intelligence Paradigm

Chenyuan Feng, Anbang Zhang, Geyong Min et al.

The evolution toward sixth-generation wireless systems positions intelligence as a native network capability, fundamentally transforming the design of radio access networks (RANs). Within this vision, semantic-native communication (SemCom) and agentic intelligence are expected to play central roles. SemCom departs from bit-level fidelity and instead emphasizes task-oriented meaning exchange, enabling compact SC and introducing new performance measures such as semantic fidelity and task success rate. Agentic intelligence endows distributed RAN entities with goal-driven autonomy, reasoning, planning, and multi-agent collaboration, increasingly supported by foundation models and knowledge graphs. In this work, we first introduce the conceptual foundations of SemCom and agentic networking, and discuss why existing AI-driven O-RAN solutions remain largely bit-centric and task-siloed. We then present a unified taxonomy that organizes recent research along three axes: i) semantic abstraction level (symbol/feature/intent/knowledge), ii) agent autonomy and coordination granularity (single-, multi-, and hierarchical-agent), and iii) RAN control placement across PHY/MAC, near-real-time RIC, and non-real-time RIC. Based on this taxonomy, we systematically introduce enabling technologies including task-oriented semantic encoders/decoders, multi-agent reinforcement learning, foundation-model-assisted RAN agents, and knowledge-graph-based reasoning for cross-layer awareness. Representative 6G use cases, such as immersive XR, vehicular V2X, and industrial digital twins, are analyzed to illustrate the semantic-agentic convergence in practice. Finally, we identify open challenges in semantic representation standardization, scalable trustworthy agent coordination, O-RAN interoperability, and energy-efficient AI deployment, and outline research directions toward operational semantic-agentic AI-RAN.

NI · Dec 19, 2024
Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities

Qimei Cui, Xiaohu You, Ni Wei et al.

With the growing demand for seamless connectivity and intelligent communication, the integration of artificial intelligence (AI) and sixth-generation (6G) communication networks has emerged as a transformative paradigm. By embedding AI capabilities across various network layers, this integration enables optimized resource allocation, improved efficiency, and enhanced system robustness, particularly in intricate and dynamic environments. This paper presents a comprehensive overview of AI and communication for 6G networks, with a focus on their foundational principles, inherent challenges, and future research opportunities. We first review the integration of AI and communications in the context of 6G, exploring the driving factors behind incorporating AI into wireless communications, as well as the vision for the convergence of AI and 6G. We then describe in detail the envisioned integration of AI within 6G networks across three progressive developmental stages. The first stage, AI for Network, focuses on employing AI to augment network performance, optimize efficiency, and enhance user service experiences. The second stage, Network for AI, highlights the role of the network in facilitating and supporting AI operations and presents key enabling technologies, such as digital twins for AI and semantic communication. In the final stage, AI as a Service, it is anticipated that future 6G networks will natively provide AI functions as services, supporting application scenarios like immersive communication and intelligent industrial robots. In addition, we conduct an in-depth analysis of the critical challenges faced by the integration of AI and communications in 6G. Finally, we outline promising future research opportunities that are expected to drive the development and refinement of AI and 6G communications.

IT · Mar 9, 2024
Large Generative Model Assisted 3D Semantic Communication

Feibo Jiang, Yubo Peng, Li Dong et al.

Semantic Communication (SC) is a novel paradigm for data transmission in 6G. However, there are several challenges posed when performing SC in 3D scenarios: 1) 3D semantic extraction; 2) Latent semantic redundancy; and 3) Uncertain channel estimation. To address these issues, we propose a Generative AI Model assisted 3D SC (GAM-3DSC) system. Firstly, we introduce a 3D Semantic Extractor (3DSE), which employs generative AI models, including Segment Anything Model (SAM) and Neural Radiance Field (NeRF), to extract key semantics from a 3D scenario based on user requirements. The extracted 3D semantics are represented as multi-perspective images of the goal-oriented 3D object. Then, we present an Adaptive Semantic Compression Model (ASCM) for encoding these multi-perspective images, in which we use a semantic encoder with two output heads to perform semantic encoding and mask redundant semantics in the latent semantic space, respectively. Next, we design a conditional Generative adversarial network and Diffusion model aided-Channel Estimation (GDCE) to estimate and refine the Channel State Information (CSI) of physical channels. Finally, simulation results demonstrate the advantages of the proposed GAM-3DSC system in effectively transmitting the goal-oriented 3D scenario.

IT · Dec 17, 2024
Distributed satellite information networks: Architecture, enabling technologies, and trends

Qinyu Zhang, Liang Xu, Jianhao Huang et al.

Driven by the vision of ubiquitous connectivity and wireless intelligence, the evolution of ultra-dense constellation-based satellite-integrated Internet is underway, now taking preliminary shape. Nevertheless, the entrenched institutional silos and limited, nonrenewable heterogeneous network resources leave current satellite systems struggling to accommodate the escalating demands of next-generation intelligent applications. In this context, the distributed satellite information networks (DSIN), exemplified by the cohesive clustered satellites system, have emerged as an innovative architecture, bridging information gaps across diverse satellite systems, such as communication, navigation, and remote sensing, and establishing a unified, open information network paradigm to support resilient space information services. This survey first provides a profound discussion about innovative network architectures of DSIN, encompassing distributed regenerative satellite network architecture, distributed satellite computing network architecture, and reconfigurable satellite formation flying, to enable flexible and scalable communication, computing and control. The DSIN faces challenges from network heterogeneity, unpredictable channel dynamics, sparse resources, and decentralized collaboration frameworks. To address these issues, a series of enabling technologies is identified, including channel modeling and estimation, cloud-native distributed MIMO cooperation, grant-free massive access, network routing, and the proper combination of all these diversity techniques. Furthermore, to heighten the overall resource efficiency, the cross-layer optimization techniques are further developed to meet upper-layer deterministic, adaptive and secure information services requirements. In addition, emerging research directions and new opportunities are highlighted on the way to achieving the DSIN vision.

87.4 · IT · Apr 23
Spatiotemporal 2-D Polar Codes over Non-Uniform MIMO Channels: A Reliability-Aware Construction Approach

Yaqi Li, Shuohan Zhang, Xiaohu You et al.

With the increasing demand for ultra-reliable and low-latency communication (URLLC), spatiotemporal two-dimensional (2-D) channel coding has received growing interest. By leveraging the spatial degrees of freedom in massive multiple-input multiple-output (MIMO) systems, it shortens the time-domain blocklength, thereby reducing latency and enhancing reliability. However, existing spatiotemporal coding schemes typically assume uniform reliability across spatial streams. This assumption does not hold in practical MIMO channels, where the underlying propagation environment generally leads to unequal spatial-eigenmode gains and reliabilities, making the conventional Gaussian-approximation-based construction for 2-D polar codes less effective. This paper investigates spatiotemporal 2-D polar coding over non-uniform MIMO channels, where the spatial domain exhibits inherently heterogeneous signal-to-noise ratios (SNRs). We propose a reciprocal channel approximation (RCA)-based reliability-aware 2-D polar coding framework that accurately characterizes such heterogeneous SNRs without relying on log-likelihood-ratio distribution assumptions. Simulation results demonstrate that the proposed RCA-based spatiotemporal 2-D polar coding scheme achieves clear performance gains and strong robustness, confirming its effectiveness in jointly exploiting temporal and spatial polarization for URLLC in practical MIMO systems.

IT · Nov 6, 2024
Large Generative Model-assisted Talking-face Semantic Communication System

Feibo Jiang, Siwei Tu, Li Dong et al.

The rapid development of generative Artificial Intelligence (AI) continually unveils the potential of Semantic Communication (SemCom). However, current talking-face SemCom systems still encounter challenges such as low bandwidth utilization, semantic ambiguity, and diminished Quality of Experience (QoE). This study introduces a Large Generative Model-assisted Talking-face Semantic Communication (LGM-TSC) System tailored for the talking-face video communication. Firstly, we introduce a Generative Semantic Extractor (GSE) at the transmitter based on the FunASR model to convert semantically sparse talking-face videos into texts with high information density. Secondly, we establish a private Knowledge Base (KB) based on the Large Language Model (LLM) for semantic disambiguation and correction, complemented by a joint knowledge base-semantic-channel coding scheme. Finally, at the receiver, we propose a Generative Semantic Reconstructor (GSR) that utilizes BERT-VITS2 and SadTalker models to transform text back into a high-QoE talking-face video matching the user's timbre. Simulation results demonstrate the feasibility and effectiveness of the proposed LGM-TSC system.

IT · Feb 15, 2024
Digital versus Analog Transmissions for Federated Learning over Wireless Networks

Jiacheng Yao, Wei Xu, Zhaohui Yang et al.

In this paper, we quantitatively compare two effective communication schemes, digital and analog, for wireless federated learning (FL) over resource-constrained networks, highlighting their essential differences as well as their respective application scenarios. We first examine both digital and analog transmission methods, together with a unified and fair comparison scheme under practical constraints. A universal convergence analysis under various imperfections is established for FL performance evaluation in wireless networks. These analytical results reveal that the fundamental difference between the two paradigms lies in whether communication and computation are jointly designed. The digital schemes decouple the communication design from specific FL tasks, making it difficult to support simultaneous uplink transmission from massive numbers of devices with limited bandwidth. In contrast, analog communication allows over-the-air computation (AirComp), thus achieving efficient spectrum utilization. However, computation-oriented analog transmission reduces power efficiency, and its performance is sensitive to computational errors. Finally, numerical simulations are conducted to verify these theoretical observations.

NI · Apr 16, 2024
Learning Wireless Data Knowledge Graph for Green Intelligent Communications: Methodology and Experiments

Yongming Huang, Xiaohu You, Hang Zhan et al.

Intelligent communications have played a pivotal role in shaping the evolution of 6G networks. Native artificial intelligence (AI) within green communication systems must meet stringent real-time requirements. To achieve this, deploying lightweight and resource-efficient AI models is necessary. However, although wireless networks generate a multitude of data fields and indicators during operation, only a fraction of them has a significant impact on the network AI models. Therefore, the real-time intelligence of communication systems relies heavily on a small but critical set of data that profoundly influences the performance of network AI models. These challenges underscore the need for innovative architectures and solutions. In this paper, we propose a solution, termed the pervasive multi-level (PML) native AI architecture, which integrates the concept of the knowledge graph (KG) into the intelligent operation of mobile networks, resulting in the establishment of a wireless data KG. Leveraging the wireless data KG, we characterize the massive and complex data collected from wireless communication networks and analyze the relationships among various data fields. The obtained graph of data field relations enables the on-demand generation of minimal and effective datasets, referred to as feature datasets, tailored to specific application requirements. Consequently, this architecture not only enhances AI training, inference, and validation processes but also significantly reduces resource wastage and overhead for communication networks. To implement this architecture, we have developed a specific solution comprising a spatio-temporal heterogeneous graph attention neural network model (STREAM) as well as a feature dataset generation algorithm. Experiments are conducted to validate the effectiveness of the proposed architecture.

SP · Dec 14, 2024
Model-driven deep neural network for enhanced direction finding with commodity 5G gNodeB

Shengheng Liu, Zihuan Mao, Xingkang Li et al.

Pervasive and high-accuracy positioning has become increasingly important as a fundamental enabler for intelligent connected devices in mobile networks. Nevertheless, current wireless networks heavily rely on pure model-driven techniques to achieve positioning functionality, often succumbing to performance deterioration due to hardware impairments in practical scenarios. Here we reformulate the direction finding or angle-of-arrival (AoA) estimation problem as an image recovery task of the spatial spectrum and propose a new model-driven deep neural network (MoD-DNN) framework. The proposed MoD-DNN scheme comprises three modules: a multi-task autoencoder-based beamformer, a coarray spectrum generation module, and a model-driven deep learning-based spatial spectrum reconstruction module. Our technique enables automatic calibration of angular-dependent phase error thereby enhancing the resilience of direction-finding precision against realistic system non-idealities. We validate the proposed scheme both using numerical simulations and field tests. The results show that the proposed MoD-DNN framework enables effective spectrum calibration and accurate AoA estimation. To the best of our knowledge, this study marks the first successful demonstration of hybrid data-and-model-driven direction finding utilizing readily available commodity 5G gNodeB.

LGFeb 25, 2025
Provable Performance Bounds for Digital Twin-driven Deep Reinforcement Learning in Wireless Networks: A Novel Digital-Twin Bisimulation Metric

Zhenyu Tao, Wei Xu, Xiaohu You

Digital twin (DT)-driven deep reinforcement learning (DRL) has emerged as a promising paradigm for wireless network optimization, offering a safe and efficient training environment for policy exploration. However, existing methods cannot, in theory, always guarantee the real-world performance of DT-trained policies before actual deployment, due to the absence of a universal metric for assessing a DT's ability to support reliable DRL training transferable to physical networks. In this paper, we propose the DT bisimulation metric (DT-BSM), a novel metric based on the Wasserstein distance, to quantify the discrepancy between Markov decision processes (MDPs) in both the DT and the corresponding real-world wireless network environment. We prove that for any DT-trained policy, the sub-optimality of its performance (regret) in the real-world deployment is bounded by a weighted sum of the DT-BSM and its sub-optimality within the MDP in the DT. Then, a modified DT-BSM based on the total variation distance is also introduced to avoid the prohibitive computational complexity of the Wasserstein distance for large-scale wireless network scenarios. Further, to tackle the challenge of obtaining accurate transition probabilities of the MDP in the real world for the DT-BSM calculation, we propose an empirical DT-BSM method based on statistical sampling. We prove that the empirical DT-BSM always converges to the desired theoretical one, and quantitatively establish the relationship between the required sample size and the target level of approximation accuracy. Numerical experiments validate this first theoretical finding on the provable and calculable performance bounds for DT-driven DRL.
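
As a rough illustration of the sampling idea mentioned in this abstract, the sketch below estimates the total variation distance between two next-state distributions from observed samples. It is not the paper's DT-BSM: the function names and the toy sample lists are assumptions made for illustration, and only the total variation ingredient is shown.

```python
from collections import Counter


def empirical_distribution(samples):
    """Estimate a discrete distribution from a list of observed next states."""
    counts = Counter(samples)
    total = len(samples)
    return {state: count / total for state, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


# Hypothetical next-state samples for one state-action pair,
# observed in the real network and in the digital twin.
real_samples = ["a", "a", "b", "b"]
twin_samples = ["a", "a", "a", "b"]

tv = total_variation(
    empirical_distribution(real_samples),
    empirical_distribution(twin_samples),
)
```

With more samples per state-action pair, the empirical distributions approach the true transition probabilities, which is the intuition behind relating sample size to approximation accuracy.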

LGSep 23, 2025
A Generalized Bisimulation Metric of State Similarity between Markov Decision Processes: From Theoretical Propositions to Applications

Zhenyu Tao, Wei Xu, Xiaohu You

The bisimulation metric (BSM) is a powerful tool for computing state similarities within a Markov decision process (MDP), revealing that states closer in BSM have more similar optimal value functions. While BSM has been successfully utilized in reinforcement learning (RL) for tasks like state representation learning and policy exploration, its application to multiple-MDP scenarios, such as policy transfer, remains challenging. Prior work has attempted to generalize BSM to pairs of MDPs, but a lack of rigorous analysis of its mathematical properties has limited further theoretical progress. In this work, we formally establish a generalized bisimulation metric (GBSM) between pairs of MDPs and rigorously prove three fundamental properties: GBSM symmetry, the inter-MDP triangle inequality, and the distance bound on identical state spaces. Leveraging these properties, we theoretically analyse policy transfer, state aggregation, and sampling-based estimation in MDPs, obtaining explicit bounds that are strictly tighter than those derived from the standard BSM. Additionally, GBSM provides a closed-form sample complexity for estimation, improving upon existing asymptotic results based on BSM. Numerical results validate our theoretical findings and demonstrate the effectiveness of GBSM in multi-MDP scenarios.

AISep 3, 2023
Large AI Model Empowered Multimodal Semantic Communications

Feibo Jiang, Li Dong, Yubo Peng et al.

Multimodal signals, including text, audio, image, and video, can be integrated into Semantic Communication (SC) systems to provide an immersive experience with low latency and high quality at the semantic level. However, multimodal SC faces several challenges, including data heterogeneity, semantic ambiguity, and signal distortion during transmission. Recent advancements in large AI models, particularly in the Multimodal Language Model (MLM) and Large Language Model (LLM), offer potential solutions for addressing these issues. To this end, we propose a Large AI Model-based Multimodal SC (LAM-MSC) framework, where we first present the MLM-based Multimodal Alignment (MMA) that utilizes the MLM to enable the transformation between multimodal and unimodal data while preserving semantic consistency. Then, a personalized LLM-based Knowledge Base (LKB) is proposed, which allows users to perform personalized semantic extraction or recovery through the LLM. This effectively addresses the semantic ambiguity. Next, we apply the Conditional Generative adversarial network-based channel Estimation (CGE) for estimating the wireless channel state information. This approach effectively mitigates the impact of fading channels in SC. Finally, we conduct simulations that demonstrate the superior performance of the LAM-MSC framework.

NINov 26, 2020
True-data Testbed for 5G/B5G Intelligent Network

Yongming Huang, Shengheng Liu, Cheng Zhang et al.

Future beyond fifth-generation (B5G) and sixth-generation (6G) mobile communications will shift from facilitating interpersonal communications to supporting the Internet of Everything (IoE), where intelligent communications with full integration of big data and artificial intelligence (AI) will play an important role in improving network efficiency and providing high-quality service. As a rapidly evolving paradigm, AI-empowered mobile communications demand large amounts of data acquired from real network environments for systematic test and verification. Hence, we build the world's first true-data testbed for 5G/B5G intelligent network (TTIN), which comprises 5G/B5G on-site experimental networks, data acquisition & data warehouse, and AI engine & network optimization. In the TTIN, true network data acquisition, storage, standardization, and analysis are available, which enable system-level online verification of B5G/6G-orientated key technologies and support data-driven network optimization through the closed-loop control mechanism. This paper elaborates on the system architecture and module design of TTIN. Detailed technical specifications and some of the established use cases are also showcased.

LGFeb 27, 2019
Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks

Liuyang Lu, Yanxiang Jiang, Mehdi Bennis et al.

In this paper, the distributed edge caching problem in fog radio access networks (F-RANs) is investigated. By considering the unknown spatio-temporal content popularity and user preference, a user request model based on a hidden Markov process is proposed to characterize the fluctuating spatio-temporal traffic demands in F-RANs. Then, a Q-learning method based on the reinforcement learning (RL) framework is put forth to seek the optimal caching policy in a distributed manner, which enables fog access points (F-APs) to learn and track the potential dynamic process without extra communication cost. Furthermore, we propose a more efficient Q-learning method with value function approximation (Q-VFA-learning) to reduce complexity and accelerate convergence. Simulation results show that the performance of our proposed method is superior to that of the traditional methods.
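
To make the Q-learning idea concrete, here is a minimal tabular sketch for a single F-AP that caches one content item at a time. It is a toy under stated assumptions, not the paper's method: requests are drawn independently from a fixed popularity vector (rather than from a hidden Markov request model), the state is the last requested content, and the reward is 1 on a cache hit. The function names and all parameter values are hypothetical.

```python
import random


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def train_cache_policy(popularity, steps=20000, alpha=0.05, gamma=0.9,
                       epsilon=0.1, seed=0):
    """Learn which single content to cache, given the last requested content."""
    rng = random.Random(seed)
    n = len(popularity)
    contents = list(range(n))
    q = [[0.0] * n for _ in range(n)]  # q[last_request][cached_content]
    state = rng.choices(contents, weights=popularity)[0]
    for _ in range(steps):
        # Epsilon-greedy choice of the content to cache.
        if rng.random() < epsilon:
            action = rng.randrange(n)
        else:
            action = max(contents, key=lambda a: q[state][a])
        # Toy assumption: the next request is i.i.d. from the popularity vector.
        request = rng.choices(contents, weights=popularity)[0]
        reward = 1.0 if request == action else 0.0  # cache hit or miss
        q_update(q, state, action, reward, request, alpha, gamma)
        state = request
    return q


q_table = train_cache_policy([0.7, 0.2, 0.1])
greedy_policy = [max(range(3), key=lambda a: row[a]) for row in q_table]
```

Each F-AP can run such an update locally on its own request stream, which is what makes the scheme distributed. A table of this kind grows with the number of states and actions, which is the usual motivation for replacing it with a value function approximation.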

SPNov 24, 2018
Polar Decoding on Sparse Graphs with Deep Learning

Weihong Xu, Xiaohu You, Chuan Zhang et al.

In this paper, we present a sparse neural network decoder (SNND) of polar codes based on belief propagation (BP) and deep learning. First, the conventional factor graph of polar BP decoding is converted to a bipartite Tanner graph similar to that of low-density parity-check (LDPC) codes. Then the Tanner graph is unfolded and translated into the graphical representation of a deep neural network (DNN). The complex sum-product algorithm (SPA) is replaced by the low-complexity min-sum (MS) approximation. We dramatically reduce the number of weights by using a single weight to parameterize the networks. Optimized by the training techniques of deep learning, the proposed SNND achieves decoding performance comparable to SPA and obtains about $0.5$ dB gain over MS decoding on ($128,64$) and ($256,128$) codes. Moreover, a $60 \%$ complexity reduction is achieved and the decoding latency is significantly lower than that of conventional polar BP.
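
The difference between the sum-product and min-sum check-node rules can be sketched in a few lines. This is the textbook form of the two updates on log-likelihood ratios (LLRs), not the SNND itself; the `weight` parameter is a hypothetical stand-in for a trainable scaling factor, and the input values are made up.

```python
import math


def check_node_spa(llrs):
    """Sum-product check-node update: 2 * atanh(prod(tanh(x / 2))).

    Note: very large |LLR| values can push the product to +/-1,
    where atanh is undefined; practical decoders clip the inputs.
    """
    prod = 1.0
    for x in llrs:
        prod *= math.tanh(x / 2.0)
    return 2.0 * math.atanh(prod)


def check_node_min_sum(llrs, weight=1.0):
    """Min-sum approximation: product of signs times the smallest magnitude.

    `weight` is an assumed scaling factor; weight=1.0 gives plain min-sum.
    """
    sign = 1.0
    for x in llrs:
        if x < 0:
            sign = -sign
    return weight * sign * min(abs(x) for x in llrs)


incoming = [2.0, -3.0, 4.0]
spa_out = check_node_spa(incoming)
ms_out = check_node_min_sum(incoming)
scaled_ms_out = check_node_min_sum(incoming, weight=0.8)
```

Plain min-sum avoids the `tanh`/`atanh` evaluations but overestimates the output magnitude relative to sum-product, which is why a scaling weight below 1 is a common correction.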

SPApr 2, 2018
Improving Massive MIMO Belief Propagation Detector with Deep Neural Network

Xiaosi Tan, Weihong Xu, Yair Be'ery et al.

In this paper, a deep neural network (DNN) is utilized to improve belief propagation (BP) detection for massive multiple-input multiple-output (MIMO) systems. A neural network architecture suitable for the detection task is first introduced by unfolding BP algorithms. DNN MIMO detectors are then proposed based on two modified BP detectors, damped BP and max-sum BP. The correction factors in these algorithms are optimized through deep learning techniques, aiming at improved detection performance. Numerical results are presented to demonstrate the performance of the DNN detectors in comparison with various BP modifications. The neural network is trained once and can be used for multiple online detections. The results show that, compared to other state-of-the-art detectors, the DNN detectors can achieve lower bit error rate (BER) with improved robustness against various antenna configurations and channel conditions at the same level of complexity.
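
For readers unfamiliar with damping, the sketch below shows the generic message-damping step used in damped BP: each new message is blended with its value from the previous iteration. This is the general technique, assumed here for illustration rather than taken from the paper; the function name, the damping factor `delta`, and the message values are hypothetical.

```python
def damp_messages(old_messages, new_messages, delta):
    """Blend freshly computed BP messages with the previous iteration's.

    delta = 0 keeps the new messages unchanged; delta close to 1
    keeps mostly the old ones and slows the update down.
    """
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return [
        (1.0 - delta) * new + delta * old
        for old, new in zip(old_messages, new_messages)
    ]


damped = damp_messages([1.0, 2.0], [3.0, 4.0], delta=0.25)
```

In an unfolded network, a factor such as `delta` can be treated as a trainable parameter per iteration instead of a hand-tuned constant, which is one way to read the "correction factors" mentioned above.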