Feifei Gao

IT
h-index: 24
22 papers
1,669 citations
Novelty: 47%
AI Score: 57

22 Papers

CV | Aug 8, 2022
Towards Semantic Communications: Deep Learning-Based Image Semantic Coding

Danlan Huang, Feifei Gao, Xiaoming Tao et al.

Semantic communications has received growing interest since it can remarkably reduce the amount of data to be transmitted without missing critical information. Most existing works explore the semantic encoding and transmission for text and apply techniques in Natural Language Processing (NLP) to interpret the meaning of the text. In this paper, we conceive the semantic communications for image data that is much richer in semantics and bandwidth sensitive. We propose a reinforcement learning based adaptive semantic coding (RL-ASC) approach that encodes images beyond pixel level. Firstly, we define the semantic concept of image data that includes the category, spatial arrangement, and visual feature as the representation unit, and propose a convolutional semantic encoder to extract semantic concepts. Secondly, we propose the image reconstruction criterion that evolves from the traditional pixel similarity to semantic similarity and perceptual performance. Thirdly, we design a novel RL-based semantic bit allocation model, whose reward is the increase in rate-semantic-perceptual performance after encoding a certain semantic concept with adaptive quantization level. Thus, the task-related information is preserved and reconstructed properly while less important data is discarded. Finally, we propose the Generative Adversarial Nets (GANs) based semantic decoder that fuses both local and global features via an attention module. Experimental results demonstrate that the proposed RL-ASC is noise robust, can reconstruct visually pleasant and semantically consistent images, and reduces the bit cost severalfold compared to standard codecs and other deep learning-based image codecs.

SY | Sep 23, 2017
Beam Tracking for UAV Mounted SatCom on-the-Move with Massive Antenna Array

Jianwei Zhao, Feifei Gao, Qihui Wu et al.

Unmanned aerial vehicle (UAV)-satellite communication has drawn dramatic attention for its potential to build the integrated space-air-ground network and the seamless wide-area coverage. The key challenge to UAV-satellite communication is its unstable beam pointing due to the UAV navigation, which is a typical SatCom on-the-move scenario. In this paper, we propose a blind beam tracking approach for a Ka-band UAV-satellite communication system, where the UAV is equipped with a large-scale antenna array. The effects of UAV navigation are first mitigated through mechanical adjustment, which approximately points the beam towards the target satellite through beam stabilization and dynamic isolation. Specifically, the attitude information can be derived in real time from data fusion of low-cost sensors. Then, the precision of the beam pointing is blindly refined by electrically adjusting the weights of the massive antennas, where an array structure based simultaneous perturbation algorithm is designed. Simulation results are provided to demonstrate the superiority of the proposed method over the existing ones.
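As a rough illustration of the "simultaneous perturbation" idea mentioned in this abstract, the sketch below implements plain simultaneous perturbation stochastic approximation on a toy quadratic loss. It is not the paper's array-structure-based variant: the function names, the constant gains `a` and `c`, and the `TARGET` vector standing in for ideal antenna weights are all assumptions made for illustration.

```python
import random


def spsa_minimize(loss, x0, steps=200, a=0.1, c=0.1, seed=0):
    """Generic simultaneous perturbation (hypothetical helper).

    Every coordinate is perturbed at once, so each step needs only two
    loss evaluations regardless of how many weights are tuned.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        # Random +/-1 perturbation applied to all coordinates together.
        delta = [rng.choice((-1.0, 1.0)) for _ in x]
        plus = loss([xi + c * di for xi, di in zip(x, delta)])
        minus = loss([xi - c * di for xi, di in zip(x, delta)])
        # One two-sided difference is shared by every coordinate.
        diff = (plus - minus) / (2.0 * c)
        x = [xi - a * diff / di for xi, di in zip(x, delta)]
    return x


# Toy stand-in for a pointing error: squared distance to assumed ideal weights.
TARGET = [0.3, -0.2, 0.5]


def pointing_loss(weights):
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET))


start = [1.3, -2.2, 1.0]
refined = spsa_minimize(pointing_loss, start)
print(pointing_loss(start), pointing_loss(refined))
```

The method is "blind" in the sense that it only queries the loss value (for example, a measured received power) and never needs an explicit gradient or channel estimate.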

SP | Mar 2, 2023
Pay Less But Get More: A Dual-Attention-based Channel Estimation Network for Massive MIMO Systems with Low-Density Pilots

Binggui Zhou, Xi Yang, Shaodan Ma et al.

To reap the promising benefits of massive multiple-input multiple-output (MIMO) systems, accurate channel state information (CSI) is required through channel estimation. However, due to the complicated wireless propagation environment and large-scale antenna arrays, precise channel estimation for massive MIMO systems is significantly challenging and costs an enormous training overhead. Considerable time-frequency resources are consumed to acquire sufficient accuracy of CSI, which thus severely degrades systems' spectral and energy efficiencies. In this paper, we propose a dual-attention-based channel estimation network (DACEN) to realize accurate channel estimation via low-density pilots, by jointly learning the spatial-temporal domain features of massive MIMO channels with the temporal attention module and the spatial attention module. To further improve the estimation accuracy, we propose a parameter-instance transfer learning approach to transfer the channel knowledge learned from the high-density pilots pre-acquired during the training dataset collection period. Experimental results reveal that the proposed DACEN-based method achieves better channel estimation performance than the existing methods under various pilot-density settings and signal-to-noise ratios. Additionally, with the proposed parameter-instance transfer learning approach, the DACEN-based method achieves additional performance gain, thereby further demonstrating the effectiveness and superiority of the proposed method.

IT | Jan 2, 2023
Model-Driven Deep Learning for Non-Coherent Massive Machine-Type Communications

Zhe Ma, Wen Wu, Feifei Gao et al.

In this paper, we investigate the joint device activity and data detection in massive machine-type communications (mMTC) with a one-phase non-coherent scheme, where data bits are embedded in the pilot sequences and the base station simultaneously detects active devices and their embedded data bits without explicit channel estimation. Due to the correlated sparsity pattern introduced by the non-coherent transmission scheme, the traditional approximate message passing (AMP) algorithm cannot achieve satisfactory performance. Therefore, we propose a deep learning (DL) modified AMP network (DL-mAMPnet) that enhances the detection performance by effectively exploiting the pilot activity correlation. The DL-mAMPnet is constructed by unfolding the AMP algorithm into a feedforward neural network, which combines the principled mathematical model of the AMP algorithm with the powerful learning capability, thereby benefiting from the advantages of both techniques. Trainable parameters are introduced in the DL-mAMPnet to approximate the correlated sparsity pattern and the large-scale fading coefficient. Moreover, a refinement module is designed to further advance the performance by utilizing the spatial feature caused by the correlated sparsity pattern. Simulation results demonstrate that the proposed DL-mAMPnet can significantly outperform traditional algorithms in terms of the symbol error rate performance.
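For orientation, the textbook AMP recursion that unfolded networks of this kind start from can be written as below. This is the generic real-valued, single-measurement-vector form with a soft-threshold denoiser, not the modified recursion of DL-mAMPnet. The symbols are notation assumed here: $\mathbf{A}$ is an $M \times N$ pilot (sensing) matrix, $\mathbf{y}$ the received signal, $\mathbf{x}^{t}$ the sparse estimate, $\mathbf{v}^{t}$ the corrected residual, and $\alpha$ a threshold scale, with $\mathbf{x}^{0} = \mathbf{0}$ and $\mathbf{v}^{0} = \mathbf{y}$.

```latex
\begin{aligned}
\mathbf{r}^{t}   &= \mathbf{x}^{t} + \mathbf{A}^{\mathsf{T}} \mathbf{v}^{t}, \\
\mathbf{x}^{t+1} &= \eta\!\left(\mathbf{r}^{t};\, \lambda_t\right),
  \qquad \lambda_t = \frac{\alpha}{\sqrt{M}} \,\lVert \mathbf{v}^{t} \rVert_2, \\
\mathbf{v}^{t+1} &= \mathbf{y} - \mathbf{A}\mathbf{x}^{t+1}
  + \frac{\lVert \mathbf{x}^{t+1} \rVert_0}{M}\, \mathbf{v}^{t},
\end{aligned}
\qquad
\eta(r;\lambda) = \operatorname{sign}(r)\,\max\!\left(|r| - \lambda,\, 0\right).
```

Unfolding treats each of $T$ iterations as one network layer and turns quantities such as $\alpha$ (and, in some designs, the matrix applied in place of $\mathbf{A}^{\mathsf{T}}$) into per-layer trainable parameters.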

LG | Mar 8, 2022
Cluster Head Detection for Hierarchical UAV Swarm With Graph Self-supervised Learning

Zhiyu Mou, Jun Liu, Xiang Yun et al.

In this paper, we study the cluster head detection problem of a two-level unmanned aerial vehicle (UAV) swarm network (USNET) with multiple UAV clusters, where the inherent follow strategy (IFS) of low-level follower UAVs (FUAVs) with respect to high-level cluster head UAVs (HUAVs) is unknown. We first propose a graph attention self-supervised learning algorithm (GASSL) to detect the HUAVs of a single UAV cluster, where the GASSL can fit the IFS at the same time. Then, to detect the HUAVs in the USNET with multiple UAV clusters, we develop a multi-cluster graph attention self-supervised learning algorithm (MC-GASSL) based on the GASSL. The MC-GASSL clusters the USNET with a gated recurrent unit (GRU)-based metric learning scheme and finds the HUAVs in each cluster with GASSL. Numerical results show that the GASSL can detect the HUAVs in single UAV clusters obeying various kinds of IFSs with over 98% average accuracy. The simulation results also show that the clustering purity of the USNET with MC-GASSL exceeds that with traditional clustering algorithms by at least 10% on average. Furthermore, the MC-GASSL can efficiently detect all the HUAVs in USNETs with various IFSs and cluster numbers with low detection redundancies.
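The clustering purity quoted in this abstract is a standard metric: each predicted cluster is credited with its most common true label, and the credited counts are divided by the total number of items. A minimal sketch follows; the function name and the toy labels are illustrative and not taken from the paper.

```python
from collections import Counter


def clustering_purity(cluster_ids, true_labels):
    """Fraction of items that carry the majority true label of their cluster."""
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    if not cluster_ids:
        raise ValueError("at least one item is required")
    # Group the true labels by predicted cluster.
    members = {}
    for cid, label in zip(cluster_ids, true_labels):
        members.setdefault(cid, []).append(label)
    # Count the majority label in every cluster and sum the counts.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(cluster_ids)


# Cluster 0 holds two "a" and one "b"; cluster 1 holds three "b".
predicted = [0, 0, 0, 1, 1, 1]
truth = ["a", "a", "b", "b", "b", "b"]
purity = clustering_purity(predicted, truth)
print(purity)
```

Here five of the six items match their cluster's majority label, so the purity is 5/6.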

37.8 | IT | Apr 7
Wireless Large AI Model: Shaping the AI-Native Future of 6G and Beyond

Fenghao Zhu, Xinquan Wang, Siming Jiang et al.

The emergence of sixth-generation and beyond communication systems is expected to fundamentally transform digital experiences through introducing unparalleled levels of intelligence, efficiency, and connectivity. A promising technology poised to enable this revolutionary vision is a wireless large AI model (WLAM), characterized by its exceptional capabilities in data processing, inference, and decision-making. In light of these remarkable capabilities, this paper provides a comprehensive survey of WLAM, explaining its fundamental principles, diverse applications, critical challenges, and future research opportunities. We begin by introducing the background of WLAM and analyzing the key synergies with wireless networks, emphasizing the mutual benefits. Subsequently, we explore the foundational characteristics of WLAM, delving into their unique relevance in wireless environments. Then, the role of WLAM in optimizing wireless communication systems across various use cases and the reciprocal benefits are systematically investigated. Furthermore, we discuss the integration of WLAM with emerging technologies, highlighting their potential to enable transformative capabilities and breakthroughs in wireless communication. Finally, we thoroughly examine the high-level challenges and discuss pivotal future research directions.

NA | Jan 30, 2023
Deep learning numerical methods for high-dimensional fully nonlinear PIDEs and coupled FBSDEs with jumps

Wansheng Wang, Jie Wang, Jinping Li et al.

We propose a deep learning algorithm for solving high-dimensional parabolic integro-differential equations (PIDEs) and high-dimensional forward-backward stochastic differential equations with jumps (FBSDEJs), where the jump-diffusion processes are driven by a Brownian motion and an independent compensated Poisson random measure. In this novel algorithm, a pair of deep neural networks for the approximations of the gradient and the integral kernel is introduced in a crucial way based on the deep FBSDE method. To derive the error estimates for this deep learning algorithm, the convergence of the Markovian iteration, the error bound of the Euler time discretization, and the simulation error of the deep learning algorithm are investigated. Two numerical examples are provided to show the efficiency of the proposed algorithm.

45.9 | RO | May 7 | Code
VLA-GSE: Boosting Parameter-Efficient Fine-Tuning in VLA with Generalized and Specialized Experts

Yuhua Jiang, Junjie Lu, Xinyao Qin et al.

Vision-language-action (VLA) models inherit rich visual-semantic priors from pre-trained vision-language backbones, but adapting them to robotic control remains challenging. Full fine-tuning (FFT) is prone to overfitting on downstream robotic data and catastrophic forgetting of pretrained vision-language capabilities. Parameter-efficient fine-tuning (PEFT) better preserves pre-trained knowledge, yet existing PEFT methods still struggle to adapt effectively to robot control tasks. To address this gap, we propose VLA-GSE, a parameter-efficient VLA fine-tuning framework that improves control adaptation while retaining PEFT's knowledge preservation advantage. Specifically, VLA-GSE (Generalized and Specialized Experts) is initialized by spectrally decomposing the frozen backbone, assigning leading singular components to generalized experts (shared experts) and disjoint residual components to specialized experts (routed experts). This decomposition improves adaptation capacity under a fixed trainable-parameter budget. Under a comparable parameter budget, VLA-GSE updates only 2.51% of the full model parameters and consistently outperforms strong FFT and PEFT baselines. It achieves 81.2% average zero-shot success on LIBERO-Plus, preserves pre-trained VLM capability comparably to LoRA on multimodal understanding benchmarks, and improves real-world manipulation success under multiple distribution shifts. Code is available at: https://github.com/YuhuaJiang2002/VLA-GSE

49.4 | RO | May 7 | Code
CKT-WAM: Parameter-Efficient Context Knowledge Transfer Between World Action Models

Yuhua Jiang, Yijun Guo, Hongbing Yang et al.

World action models (WAMs) provide a powerful generative framework for embodied control, yet transferring knowledge across heterogeneous WAMs remains challenging due to mismatched latent interfaces, high adaptation cost, and the rigidity of conventional distillation objectives. We propose CKT-WAM, a parameter-efficient Context Knowledge Transfer framework that transfers teacher WAM's knowledge into a student WAM through a compact context in the text embedding space, rather than output imitation or dense hidden-state matching. Specifically, CKT-WAM extracts intermediate teacher hidden states, reduces the number of tokens via compressors' learnable-query cross attention (LQCA), and transforms them through an always-on generalized adapter, a lightweight router, and sparsely activated specialized adapters. The resulting context is then appended to the student's conditioning textual embeddings, thereby injecting the transferred knowledge into the student with minimal architectural modification. Experiments show that CKT-WAM consistently improves zero-shot generalization and achieves the best overall performance on LIBERO-Plus, reaching 86.1% total success rate with only 1.17% trainable parameters, while approaching full fine-tuning performance. Beyond simulation, CKT-WAM also demonstrates strong real-world long-horizon manipulation ability, achieving the best average success rate of 83.3% across four multi-step and long-horizon tasks. Code is available at https://github.com/YuhuaJiang2002/CKT-WAM.

LG | Oct 30, 2025
Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism

Yuhua Jiang, Shuang Cheng, Yihao Liu et al.

Specialized Generalist Models (SGMs) aim to preserve broad capabilities while achieving expert-level performance in target domains. However, traditional LLM structures including Transformer, Linear Attention, and hybrid models do not employ specialized memory mechanism guided by task information. In this paper, we present Nirvana, an SGM with specialized memory mechanism, linear time complexity, and test-time task information extraction. Besides, we propose the Task-Aware Memory Trigger (Trigger) that flexibly adjusts memory mechanism based on the current task's requirements. In Trigger, each incoming sample is treated as a self-supervised fine-tuning task, enabling Nirvana to adapt its task-related parameters on the fly to domain shifts. We also design the Specialized Memory Updater (Updater) that dynamically memorizes the context guided by Trigger. We conduct experiments on both general language tasks and specialized medical tasks. On a variety of natural language modeling benchmarks, Nirvana achieves competitive or superior results compared to the existing LLM structures. To prove the effectiveness of Trigger on specialized tasks, we test Nirvana's performance on a challenging medical task, i.e., Magnetic Resonance Imaging (MRI). We post-train frozen Nirvana backbone with lightweight codecs on paired electromagnetic signals and MRI images. Despite the frozen Nirvana backbone, Trigger guides the model to adapt to the MRI domain with the change of task-related parameters. Nirvana achieves higher-quality MRI reconstruction compared to conventional MRI models as well as the models with traditional LLMs' backbone, and can also generate accurate preliminary clinical reports accordingly.

RO | Nov 18, 2025 | Code
AsyncVLA: Asynchronous Flow Matching for Vision-Language-Action Models

Yuhua Jiang, Shuang Cheng, Yan Ding et al.

Vision-language-action (VLA) models have recently emerged as a powerful paradigm for building generalist robots. However, traditional VLA models that generate actions through flow matching (FM) typically rely on rigid and uniform time schedules, i.e., synchronous FM (SFM). Without action context awareness and asynchronous self-correction, SFM becomes unstable in long-horizon tasks, where a single action error can cascade into failure. In this work, we propose asynchronous flow matching VLA (AsyncVLA), a novel framework that introduces temporal flexibility in asynchronous FM (AFM) and enables self-correction in action generation. AsyncVLA breaks from the vanilla SFM in VLA models by generating the action tokens in a non-uniform time schedule with action context awareness. Besides, our method introduces the confidence rater to extract confidence of the initially generated actions, enabling the model to selectively refine inaccurate action tokens before execution. Moreover, we propose a unified training procedure for SFM and AFM that endows a single model with both modes, improving KV-cache utilization. Extensive experiments on robotic manipulation benchmarks demonstrate that AsyncVLA is data-efficient and exhibits self-correction ability. AsyncVLA achieves state-of-the-art results across general embodied evaluations due to its asynchronous generation in AFM. Our code is available at https://github.com/YuhuaJiang2002/AsyncVLA.

IT | Dec 7, 2023
A Low-Overhead Incorporation-Extrapolation based Few-Shot CSI Feedback Framework for Massive MIMO Systems

Binggui Zhou, Xi Yang, Jintao Wang et al.

Accurate channel state information (CSI) is essential for downlink precoding in frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems with orthogonal frequency-division multiplexing (OFDM). However, obtaining CSI through feedback from the user equipment (UE) becomes challenging with the increasing scale of antennas and subcarriers and leads to extremely high CSI feedback overhead. Deep learning-based methods have emerged for compressing CSI but these methods generally require substantial collected samples and thus pose practical challenges. Moreover, existing deep learning methods also suffer from dramatically growing feedback overhead owing to their focus on full-dimensional CSI feedback. To address these issues, we propose a low-overhead Incorporation-Extrapolation based Few-Shot CSI feedback Framework (IEFSF) for massive MIMO systems. An incorporation-extrapolation scheme for eigenvector-based CSI feedback is proposed to reduce the feedback overhead. Then, to alleviate the necessity of extensive collected samples and enable few-shot CSI feedback, we further propose a knowledge-driven data augmentation (KDDA) method and an artificial intelligence-generated content (AIGC)-based data augmentation method by exploiting the domain knowledge of wireless channels and by exploiting a novel generative model, respectively. Experimental results based on the DeepMIMO dataset demonstrate that the proposed IEFSF significantly reduces CSI feedback overhead by 64 times compared with existing methods while maintaining higher feedback accuracy using only several hundred collected samples.

SP | Jun 13, 2024
Low-Overhead Channel Estimation via 3D Extrapolation for TDD mmWave Massive MIMO Systems Under High-Mobility Scenarios

Binggui Zhou, Xi Yang, Shaodan Ma et al.

In time division duplexing (TDD) millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems, downlink channel state information (CSI) can be obtained from uplink channel estimation thanks to channel reciprocity. However, under high-mobility scenarios, frequent uplink channel estimation is needed due to channel aging. Additionally, large amounts of antennas and subcarriers result in high-dimensional CSI matrices, aggravating pilot training overhead. To address this, we propose a three-domain (3D) channel extrapolation framework across spatial, frequency, and temporal domains. First, considering the effectiveness of traditional knowledge-driven channel estimation methods and the marginal effects of pilots in the spatial and frequency domains, a knowledge-and-data driven spatial-frequency channel extrapolation network (KDD-SFCEN) is proposed for uplink channel estimation via joint spatial-frequency channel extrapolation to reduce spatial-frequency domain pilot overhead. Then, leveraging channel reciprocity and temporal dependencies, we propose a temporal uplink-downlink channel extrapolation network (TUDCEN) powered by generative artificial intelligence for slot-level channel extrapolation, aiming to reduce the tremendous temporal domain pilot overhead caused by high mobility. Numerical results demonstrate the superiority of the proposed framework in significantly reducing the pilot training overhead by 16 times and improving the system's spectral efficiency under high-mobility scenarios compared with state-of-the-art channel estimation/extrapolation methods.

SP | Jan 18, 2022
Data-Driven Deep Learning Based Hybrid Beamforming for Aerial Massive MIMO-OFDM Systems with Implicit CSI

Zhen Gao, Minghui Wu, Chun Hu et al.

In an aerial hybrid massive multiple-input multiple-output (MIMO) and orthogonal frequency division multiplexing (OFDM) system, how to design a spectral-efficient broadband multi-user hybrid beamforming with a limited pilot and feedback overhead is challenging. To this end, by modeling the key transmission modules as an end-to-end (E2E) neural network, this paper proposes a data-driven deep learning (DL)-based unified hybrid beamforming framework for both the time division duplex (TDD) and frequency division duplex (FDD) systems with implicit channel state information (CSI). For TDD systems, the proposed DL-based approach jointly models the uplink pilot combining and downlink hybrid beamforming modules as an E2E neural network. For FDD systems, we jointly model the downlink pilot transmission, uplink CSI feedback, and downlink hybrid beamforming modules as an E2E neural network. Different from conventional approaches separately processing different modules, the proposed solution simultaneously optimizes all modules with the sum rate as the optimization objective. Therefore, by perceiving the inherent property of air-to-ground massive MIMO-OFDM channel samples, the DL-based E2E neural network can establish the mapping function from the channel to the beamformer, so that the explicit channel reconstruction can be avoided with reduced pilot and feedback overhead. Besides, practical low-resolution phase shifters (PSs) introduce the quantization constraint, leading to the intractable gradient backpropagation when training the neural network. To mitigate the performance loss caused by the phase quantization error, we adopt the transfer learning strategy to further fine-tune the E2E neural network based on a pre-trained network that assumes the ideal infinite-resolution PSs. Numerical results show that our DL-based schemes have considerable advantages over state-of-the-art schemes.

SP | Jun 30, 2021
Resilient UAV Swarm Communications with Graph Convolutional Neural Network

Zhiyu Mou, Feifei Gao, Jun Liu et al.

In this paper, we study the self-healing problem of unmanned aerial vehicle (UAV) swarm network (USNET) that is required to quickly rebuild the communication connectivity under unpredictable external disruptions (UEDs). Firstly, to cope with the one-off UEDs, we propose a graph convolutional neural network (GCN) and find the recovery topology of the USNET in an on-line manner. Secondly, to cope with general UEDs, we develop a GCN based trajectory planning algorithm that can make UAVs rebuild the communication connectivity during the self-healing process. We also design a meta learning scheme to facilitate the on-line executions of the GCN. Numerical results show that the proposed algorithms can rebuild the communication connectivity of the USNET more quickly than the existing algorithms under both one-off UEDs and general UEDs. The simulation results also show that the meta learning scheme can not only enhance the performance of the GCN but also reduce the time complexity of the on-line executions.

SP | May 14, 2021
Deep Learning Based RIS Channel Extrapolation with Element-grouping

Shunbo Zhang, Shun Zhang, Feifei Gao et al.

Reconfigurable intelligent surface (RIS) is considered as a revolutionary technology for future wireless communication networks. In this letter, we consider the acquisition of the cascaded channels, which is a challenging task due to the massive number of passive RIS elements. To reduce the pilot overhead, we adopt the element-grouping strategy, where each element in one group shares the same reflection coefficient and is assumed to have the same channel condition. We analyze the channel interference caused by the element-grouping strategy and further design two deep learning based networks. The first one aims to refine the partial channels by eliminating the interference, while the second one tries to extrapolate the full channels from the refined partial channels. We cascade the two networks and jointly train them. Simulation results show that the proposed scheme provides significant gain compared to the conventional element-grouping method without interference elimination.

IT | Apr 22, 2021
Model-Driven Deep Learning Based Channel Estimation and Feedback for Millimeter-Wave Massive Hybrid MIMO Systems

Xisuo Ma, Zhen Gao, Feifei Gao et al.

This paper proposes a model-driven deep learning (MDDL)-based channel estimation and feedback scheme for wideband millimeter-wave (mmWave) massive hybrid multiple-input multiple-output (MIMO) systems, where the angle-delay domain channels' sparsity is exploited for reducing the overhead. Firstly, we consider the uplink channel estimation for time-division duplexing systems. To reduce the uplink pilot overhead for estimating the high-dimensional channels from a limited number of radio frequency (RF) chains at the base station (BS), we propose to jointly train the phase shift network and the channel estimator as an auto-encoder. Particularly, by exploiting the channels' structured sparsity from an a priori model and learning the integrated trainable parameters from the data samples, the proposed multiple-measurement-vectors learned approximate message passing (MMV-LAMP) network with the devised redundant dictionary can jointly recover multiple subcarriers' channels with significantly enhanced performance. Moreover, we consider the downlink channel estimation and feedback for frequency-division duplexing systems. Similarly, the pilots at the BS and channel estimator at the users can be jointly trained as an encoder and a decoder, respectively. Besides, to further reduce the channel feedback overhead, only the received pilots on part of the subcarriers are fed back to the BS, which can exploit the MMV-LAMP network to reconstruct the spatial-frequency channel matrix. Numerical results show that the proposed MDDL-based channel estimation and feedback scheme outperforms the state-of-the-art approaches.

SP | Sep 3, 2020
Deep Learning Based Antenna Selection for Channel Extrapolation in FDD Massive MIMO

Yindi Yang, Shun Zhang, Feifei Gao et al.

In massive multiple-input multiple-output (MIMO) systems, the large number of antennas would bring a great challenge for the acquisition of the accurate channel state information, especially in the frequency division duplex mode. To overcome the bottleneck of the limited number of radio links in hybrid beamforming, we utilize the neural networks (NNs) to capture the inherent connection between the uplink and downlink channel data sets and extrapolate the downlink channels from a subset of the uplink channel state information. We study the antenna subset selection problem in order to achieve the best channel extrapolation and decrease the data size of NNs. The probabilistic sampling theory is utilized to approximate the discrete antenna selection as a continuous and differentiable function, which makes the back propagation of the deep learning feasible. Then, we design the proper off-line training strategy to optimize both the antenna selection pattern and the extrapolation NNs. Finally, numerical results are presented to verify the effectiveness of our proposed massive MIMO channel extrapolation algorithm.
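The abstract does not spell out which relaxation its "probabilistic sampling theory" refers to. As one common way to make a discrete pick differentiable, the sketch below uses a Gumbel-softmax relaxation: Gumbel noise is added to per-antenna logits and a temperature-scaled softmax returns soft selection weights. The function name, the logits, and the temperature are all assumptions made for illustration, not the paper's design.

```python
import math
import random


def relaxed_antenna_selection(logits, temperature=0.5, rng=None):
    """Gumbel-softmax sample: soft one-hot weights over candidate antennas."""
    rng = rng or random.Random()
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(U)) with U uniform on (0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical learned logits that strongly prefer antenna index 2.
weights = relaxed_antenna_selection(
    [0.0, 0.0, 20.0], temperature=0.1, rng=random.Random(0)
)
print(weights)
```

Lowering the temperature pushes the weights toward a hard one-hot choice, while the weights remain a smooth function of the logits so that gradients can flow through the selection during training.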

SP | Sep 3, 2020
Deep Learning Optimized Sparse Antenna Activation for Reconfigurable Intelligent Surface Assisted Communication

Shunbo Zhang, Shun Zhang, Feifei Gao et al.

To capture the communications gain of the massive radiating elements with low power cost, the conventional reconfigurable intelligent surface (RIS) usually works in passive mode. However, due to the cascaded channel structure and the lack of signal processing ability, it is difficult for RIS to obtain the individual channel state information and optimize the beamforming vector. In this paper, we add signal processing units for a few antennas at RIS to partially acquire the channels. To solve the crucial active antenna selection problem, we construct an active antenna selection network that utilizes the probabilistic sampling theory to select the optimal locations of these active antennas. With this active antenna selection network, we further design two deep learning (DL) based schemes, i.e., the channel extrapolation scheme and the beam searching scheme, to enable the RIS communication system. The former utilizes the selection network and a convolutional neural network to extrapolate the full channels from the partial channels received by the active RIS antennas, while the latter adopts a fully-connected neural network to achieve the direct mapping between the partial channels and the optimal beamforming vector with maximal transmission rate. Simulation results are provided to demonstrate the effectiveness of the designed DL-based schemes.

IT | Dec 27, 2019
Deep Transfer Learning Based Downlink Channel Prediction for FDD Massive MIMO Systems

Yuwen Yang, Feifei Gao, Zhimeng Zhong et al.

Artificial intelligence (AI) based downlink channel state information (CSI) prediction for frequency division duplexing (FDD) massive multiple-input multiple-output (MIMO) systems has attracted growing attention recently. However, existing works focus on the downlink CSI prediction for the users under a given environment and are hard to adapt to users in new environments, especially when labeled data is limited. To address this issue, we formulate the downlink channel prediction as a deep transfer learning (DTL) problem, where each learning task aims to predict the downlink CSI from the uplink CSI for one single environment. Specifically, we develop the direct-transfer algorithm based on the fully-connected neural network architecture, where the network is trained on the data from all previous environments in the manner of classical deep learning and is then fine-tuned for new environments. To further improve the transfer efficiency, we propose the meta-learning algorithm that trains the network by alternating inner-task and across-task updates and then adapts to a new environment with a small number of labeled data. Simulation results show that the direct-transfer algorithm achieves better performance than the deep learning algorithm, which implies that the transfer learning benefits the downlink channel prediction in new environments. Moreover, the meta-learning algorithm significantly outperforms the direct-transfer algorithm in terms of both prediction accuracy and stability, which validates its effectiveness and superiority.

IT | Sep 29, 2019
Model-aided Deep Neural Network for Source Number Detection

Yuwen Yang, Feifei Gao, Cheng Qian et al.

Source number detection is a critical problem in array signal processing. Conventional model-driven methods, e.g., Akaike's information criterion (AIC) and minimum description length (MDL), suffer from severe performance degradation when the number of snapshots is small or the signal-to-noise ratio (SNR) is low. In this paper, we exploit a model-aided deep neural network (DNN) to estimate the source number. Specifically, we first propose the eigenvalue based regression network (ERNet) and classification network (ECNet) to estimate the number of non-coherent sources, where the eigenvalues of the received signal covariance matrix and the source number are used as the input and the supervised label of the networks, respectively. Then, we extend the ERNet and ECNet for estimating the number of coherent sources, where the forward-backward spatial smoothing (FBSS) scheme is adopted to improve the performance of ERNet and ECNet. Numerical results demonstrate the outstanding performance of ERNet and ECNet over the conventional AIC and MDL methods as well as their excellent generalization capability, which also shows their great potential for practical applications.
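The conventional AIC and MDL baselines named in this abstract work on the same input as the proposed networks, namely the eigenvalues of the sample covariance matrix. A minimal sketch of the classical eigenvalue-based criteria (in the Wax-Kailath form) is given below; the function name and the example eigenvalues are illustrative and do not come from the paper.

```python
import math


def estimate_source_number(eigenvalues, num_snapshots, criterion="mdl"):
    """Pick the source count k that minimizes the classical AIC or MDL score.

    All eigenvalues must be positive; they are sorted in descending order
    so that the smallest m - k of them are treated as noise eigenvalues.
    """
    lam = sorted(eigenvalues, reverse=True)
    m = len(lam)
    best_k, best_score = 0, float("inf")
    for k in range(m):
        noise = lam[k:]
        n = len(noise)
        # Log ratio of geometric to arithmetic mean of the noise eigenvalues.
        log_geo = sum(math.log(v) for v in noise) / n
        log_arith = math.log(sum(noise) / n)
        neg_log_lik = -num_snapshots * n * (log_geo - log_arith)
        free_params = k * (2 * m - k)
        if criterion == "aic":
            score = 2.0 * neg_log_lik + 2.0 * free_params
        elif criterion == "mdl":
            score = neg_log_lik + 0.5 * free_params * math.log(num_snapshots)
        else:
            raise ValueError("criterion must be 'aic' or 'mdl'")
        if score < best_score:
            best_k, best_score = k, score
    return best_k


# Two dominant eigenvalues above a flat noise floor of 1.0.
example = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]
k_mdl = estimate_source_number(example, num_snapshots=100, criterion="mdl")
k_aic = estimate_source_number(example, num_snapshots=100, criterion="aic")
print(k_mdl, k_aic)
```

With few snapshots or low SNR the noise eigenvalues spread out and the signal eigenvalues sink toward them, which is the regime where the abstract reports these criteria degrading.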

IT | Sep 17, 2018
Model-Driven Deep Learning for Physical Layer Communications

Hengtao He, Shi Jin, Chao-Kai Wen et al.

Intelligent communication is gradually considered as the mainstream direction in future wireless communications. As a major branch of machine learning, deep learning (DL) has been applied in physical layer communications and has demonstrated an impressive performance improvement in recent years. However, most of the existing works related to DL focus on data-driven approaches, which consider the communication system as a black box and train it by using a huge volume of data. Training a network requires sufficient computing resources and extensive time, both of which are rarely found in communication devices. By contrast, model-driven DL approaches combine communication domain knowledge with DL to reduce the demand for computing resources and training time. This article reviews the recent advancements in the application of model-driven DL approaches in physical layer communications, including transmission scheme, receiver design, and channel information recovery. Several open issues for further research are also highlighted after presenting the comprehensive survey.