Yongpeng Wu

IT
h-index: 21
15 papers
254 citations
Novelty: 46%
AI Score: 53

15 Papers

SP | Jun 1
Multi-view imaging in networked sensing systems: A covariance-based approach

Junyuan Gao, Weifeng Zhu, Yanmo Hu et al.

This paper considers multi-view imaging in a sixth-generation (6G) integrated sensing and communication network, which consists of a transmit base-station (BS), multiple receive BSs connected to a central processing unit (CPU), and multiple extended targets. Our goal is to devise an effective multi-view imaging technique that can jointly leverage the targets' echo signals at all the receive BSs to precisely construct the image of these targets. To achieve this goal, we propose a two-phase approach. In Phase I, each receive BS recovers an individual image based on the sample covariance matrix of its received signals. Specifically, we propose a novel covariance-based imaging framework to jointly estimate effective scattering intensity and grid positions, which reduces the number of estimated parameters leveraging channel statistical properties and allows grid adjustment to conform to target geometry. In Phase II, the CPU fuses the individual images of all the receivers to construct a high-quality image of all the targets. Specifically, we design edge-preserving natural neighbor interpolation (EP-NNI) to map individual heterogeneous images onto common and finer grids, and then propose a joint optimization framework to estimate fused scattering intensity and BS fields of view. Extensive numerical results show that the proposed scheme significantly enhances imaging performance, facilitating high-quality environment reconstruction for future 6G networks.
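
Phase I above starts from the sample covariance matrix of each receive BS's signals. As a minimal sketch of that single step only (not the paper's imaging estimator), the matrix can be formed from a set of complex snapshots. The function name `sample_covariance` and the toy snapshots are illustrative assumptions.

```python
def sample_covariance(snapshots):
    """Return R = (1/T) * sum_t y_t y_t^H for T complex snapshot vectors.

    `snapshots` is a list of equal-length lists of complex numbers
    (a hypothetical data layout chosen for this sketch).
    """
    num_snapshots = len(snapshots)
    dim = len(snapshots[0])
    cov = [[0j] * dim for _ in range(dim)]
    for y in snapshots:
        for i in range(dim):
            for j in range(dim):
                # Outer product y y^H: entry (i, j) is y_i * conj(y_j).
                cov[i][j] += y[i] * y[j].conjugate()
    return [[entry / num_snapshots for entry in row] for row in cov]


# Toy example with two antennas and two snapshots (made-up values).
toy_snapshots = [[1 + 1j, 2 + 0j], [1 - 1j, 0 + 2j]]
R = sample_covariance(toy_snapshots)
```

The result is Hermitian by construction, which is the property covariance-based estimators typically rely on.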

LG | Aug 7, 2023
Communication-Efficient Framework for Distributed Image Semantic Wireless Transmission

Bingyan Xie, Yongpeng Wu, Yuxuan Shi et al.

Multi-node communication, which refers to the interaction among multiple devices, has attracted considerable attention in many Internet-of-Things (IoT) scenarios. However, its huge data flows and inflexibility for task extension have created an urgent need for communication-efficient distributed data transmission frameworks. In this paper, inspired by the advantages of semantic communications in bandwidth reduction and task adaptation, we propose a federated learning-based semantic communication (FLSC) framework for multi-task distributed image transmission with IoT devices. Federated learning enables the design of an independent semantic communication link for each user while further improving semantic extraction and task performance through global aggregation. Each link in FLSC is composed of a hierarchical vision transformer (HVT)-based extractor and a task-adaptive translator for coarse-to-fine semantic extraction and meaning translation according to specific tasks. To extend FLSC to more realistic conditions, we design a channel state information-based multiple-input multiple-output transmission module to combat channel fading and noise. Simulation results show that the coarse semantic information can handle a range of image-level tasks. Moreover, especially in low signal-to-noise ratio and channel bandwidth ratio regimes, FLSC clearly outperforms the traditional scheme, e.g., by about 10 dB in peak signal-to-noise ratio in the 3 dB channel condition.
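
The abstract above mentions "global aggregation" without specifying the rule. A common choice in federated learning is a sample-size-weighted average of client parameters (FedAvg-style); the sketch below assumes that rule, and the flat parameter lists and the name `federated_average` are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Weighted average of client parameter vectors.

    Assumption: FedAvg-style aggregation, where each client's parameters
    are weighted by its number of local samples. The source abstract does
    not state which aggregation rule FLSC uses.
    """
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    return [
        sum(w[k] * n for w, n in zip(client_weights, client_sizes)) / total
        for k in range(num_params)
    ]


# Two hypothetical clients holding 1 and 3 local samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count keeps a client with little data from dominating the shared model.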

IV | Apr 2, 2022
Automatic Registration of Images with Inconsistent Content Through Line-Support Region Segmentation and Geometrical Outlier Removal

Ming Zhao, Yongpeng Wu, Shengda Pan et al.

The implementation of automatic image registration is still difficult in various applications. In this paper, an automatic image registration approach through line-support region segmentation and geometrical outlier removal (ALRS-GOR) is proposed. This new approach is designed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations. To begin with, line-support regions, namely straight regions whose points share roughly the same image gradient angle, are extracted to address the issue of inconsistent content in images. To alleviate the incompleteness of line segments, an iterative multi-resolution strategy is employed to preserve global structures that are masked at full resolution by image details or noise. Then, Geometrical Outlier Removal (GOR) is developed to provide reliable feature point matching, based on affine-invariant geometrical classifications for corresponding matches initialized by SIFT. The candidate outliers are selected by comparing the disparity of accumulated classifications among all matches, unlike conventional methods, which rely only on local geometrical relations. Various image sets are considered for the evaluation of the proposed approach, including aerial images with simulated affine deformations, remote sensing optical and synthetic aperture radar images acquired under different conditions (multispectral, multisensor, and multitemporal), and map images with inconsistent annotations. Experimental results demonstrate the superior performance of the proposed method over the existing approaches for the whole data set.

IV | Apr 2, 2022
RFVTM: A Recovery and Filtering Vertex Trichotomy Matching for Remote Sensing Image Registration

Ming Zhao, Bowen An, Yongpeng Wu et al.

Reliable feature point matching is a vital yet challenging process in feature-based image registration. In this paper, a robust feature point matching algorithm called Recovery and Filtering Vertex Trichotomy Matching (RFVTM) is proposed to remove outliers and retain sufficient inliers for remote sensing images. A novel affine-invariant descriptor, called the vertex trichotomy descriptor, is proposed based on the fact that geometrical relations between vertices and lines are preserved after affine transformations; it is constructed by mapping each vertex into trichotomy sets. Outlier removal in Vertex Trichotomy Matching (VTM) is implemented by iteratively comparing the disparity of corresponding vertex trichotomy descriptors. Some inliers mistakenly validated by a large number of outliers are removed in VTM iterations, and several residual outliers close to correct locations cannot be excluded with the same graph structures. Therefore, a recovery and filtering strategy is designed to recover some inliers based on identical vertex trichotomy descriptors and restricted transformation errors. Assisted by the additional recovered inliers, residual outliers can also be filtered out during the process of reaching an identical graph for the expanded vertex sets. Experimental results demonstrate the superior precision and stability of this algorithm under various conditions, such as remote sensing images with large transformations, duplicated patterns, or inconsistent spectral content.

IT | Apr 19
Node-Based Soft-Output Fast Successive Cancellation List Decoding of Polar Codes

Li Shen, Yongpeng Wu, Zhen Gao et al.

The soft-output successive cancellation list (SO-SCL) decoder provides a methodology for estimating the a-posteriori probability log-likelihood ratios by only leveraging the conventional SCL decoder of polar codes. However, the sequential decoding nature of SCL introduces high decoding latency to SO-SCL. In this paper, we incorporate node-based fast decoding into the SO-SCL framework. After addressing the challenge of soft output extraction in special node decoding, we propose the soft-output fast SCL (SO-FSCL) decoding algorithm, along with its log-domain implementation and hardware-friendly version. The proposed SO-FSCL decoder can be regarded as an add-on extension to the FSCL decoder, enabling us to choose freely whether to output only hard decisions like FSCL or to provide additional soft outputs. Latency and complexity analyses demonstrate that SO-FSCL can significantly reduce, for example, decoding time steps by 81.8% (with unlimited resources), the number of additions by 41.3%, and the number of comparisons by 46.4%. Meanwhile, simulation results indicate that SO-FSCL delivers almost the same soft-output performance as SO-SCL, outperforming other soft-output polar decoders, especially in scenarios involving iterative decoding.
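
For context on the sequential nature mentioned above, successive cancellation decoding of polar codes propagates log-likelihood ratios (LLRs) through two standard node updates, usually called f and g. The sketch below shows the textbook min-sum versions only; it is not the SO-FSCL soft-output extraction proposed in the paper, and the function names are illustrative.

```python
import math


def f_minsum(llr_a, llr_b):
    """Min-sum approximation of the check-node (f) update in SC decoding."""
    sign = math.copysign(1.0, llr_a) * math.copysign(1.0, llr_b)
    return sign * min(abs(llr_a), abs(llr_b))


def g_update(llr_a, llr_b, u_hat):
    """Variable-node (g) update given the already decided partial-sum bit u_hat."""
    return llr_b + (1 - 2 * u_hat) * llr_a


# The g update needs u_hat, which is only known after the f branch has been
# decoded. This dependency is what makes plain SC/SCL decoding sequential.
left_llr = f_minsum(2.0, -3.0)
right_llr = g_update(2.0, -3.0, u_hat=1)
```

Node-based fast decoders reduce latency by deciding whole special sub-blocks at once instead of walking this recursion bit by bit.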

MM | Nov 4, 2025
Wireless Video Semantic Communication with Decoupled Diffusion Multi-frame Compensation

Bingyan Xie, Yongpeng Wu, Yuxuan Shi et al.

Existing wireless video transmission schemes conduct video coding directly at the pixel level, neglecting the inner semantics contained in videos. In this paper, we propose a wireless video semantic communication framework with decoupled diffusion multi-frame compensation (DDMFC), abbreviated as WVSC-D, which integrates the idea of semantic communication into wireless video transmission scenarios. WVSC-D first encodes original video frames as semantic frames and then conducts video coding based on these compact representations, enabling video coding at the semantic level rather than the pixel level. Moreover, to further reduce the communication overhead, a reference semantic frame is introduced to substitute for the per-frame motion vectors used in common video coding methods. At the receiver, DDMFC is proposed to generate the compensated current semantic frame by a two-stage conditional diffusion process. With both the reference frame transmission and DDMFC frame compensation, the bandwidth efficiency improves while video transmission performance remains satisfactory. Experimental results verify the performance gain of WVSC-D over other DL-based methods, e.g., about 1.8 dB over DVSC in terms of PSNR.

MM | May 3
Contextual Wireless Video Semantic Communication in MIMO-OFDM Systems

Bingyan Xie, Cong Zhou, Yuxuan Shi et al.

This paper proposes a MIMO-OFDM-based context video semantic transmission framework, namely M-CVST, for robust video communication over multi-path multiple-input multiple-output (MIMO) channels. It introduces a context-subcarrier correlation map that aligns video feature context with groups of MIMO subcarriers. To leverage the time-correlated nature of multi-path channels, a recursive subcarrier sampling method paired with time-correlated reference embedding is designed, enabling the use of previously sampled MIMO subcarrier CSI to enhance channel state awareness in the entropy coding model. Numerical results verify the superiority of the proposed M-CVST over MIMO multi-path channels compared to other semantic schemes and traditional separated schemes.

AI | Nov 11, 2024
Multi-modal Iterative and Deep Fusion Frameworks for Enhanced Passive DOA Sensing via a Green Massive H2AD MIMO Receiver

Jiatong Bai, Minghao Chen, Wankai Tang et al.

Most existing DOA estimation methods assume ideal source incident angles with minimal noise. Moreover, directly using pre-estimated angles to calculate weighted coefficients can lead to performance loss. Thus, a green multi-modal (MM) fusion DOA framework is proposed to realize more practical, low-cost, and time-efficient DOA estimation for an H2AD array. Firstly, two more efficient clustering methods, global maximum cos_similarity clustering (GMaxCS) and global minimum distance clustering (GMinD), are presented to infer more precise true solutions from the candidate solution sets. Based on this, an iteration weighted fusion (IWF)-based method is introduced to iteratively update the weighted fusion coefficients and the cluster center of the true solution classes by using the estimated values. In particular, the coarse DOA calculated by the fully digital (FD) subarray serves as the initial cluster center. The above process yields two methods called MM-IWF-GMaxCS and MM-IWF-GMinD. To further provide higher-accuracy DOA estimation, a fusion network (fusionNet) is proposed to aggregate the inferred two-part true angles, yielding two effective approaches called MM-fusionNet-GMaxCS and MM-fusionNet-GMinD. The simulation results show that the four proposed approaches can achieve the ideal DOA performance and the CRLB. Meanwhile, the proposed MM-fusionNet-GMaxCS and MM-fusionNet-GMinD exhibit superior DOA performance compared to MM-IWF-GMaxCS and MM-IWF-GMinD, especially in the extremely low SNR range.

MM | Mar 27, 2025
WVSC: Wireless Video Semantic Communication with Multi-frame Compensation

Bingyan Xie, Yongpeng Wu, Yuxuan Shi et al.

Existing wireless video transmission schemes conduct video coding directly at the pixel level, neglecting the inner semantics contained in videos. In this paper, we propose a wireless video semantic communication framework, abbreviated as WVSC, which integrates the idea of semantic communication into wireless video transmission scenarios. WVSC first encodes original video frames as semantic frames and then conducts video coding based on these compact representations, enabling video coding at the semantic level rather than the pixel level. Moreover, to further reduce the communication overhead, a reference semantic frame is introduced to substitute for the per-frame motion vectors used in common video coding methods. At the receiver, multi-frame compensation (MFC) is proposed to produce the compensated current semantic frame with a multi-frame fusion attention module. With both the reference frame transmission and MFC, the bandwidth efficiency improves while video transmission performance remains satisfactory. Experimental results verify the performance gain of WVSC over other DL-based methods, e.g., about 1 dB over DVSC and about 2 dB over traditional schemes in terms of PSNR.
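
The gains quoted above are in PSNR, which is defined from the mean squared error as PSNR = 10 log10(MAX^2 / MSE). The sketch below computes it for flat lists of pixel values; the 8-bit peak of 255 and the function name `psnr` are assumptions made for illustration.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumes 8-bit pixels (peak value 255) unless max_value is overridden.
    """
    mse = sum((r - x) ** 2 for r, x in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up pixel values with a per-pixel error of 10 (MSE = 100).
score_db = psnr([100, 100], [110, 90])
```

Because the scale is logarithmic, a 1 dB gain corresponds to an MSE lower by a factor of 10^0.1, roughly 1.26.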

NI | Mar 18, 2025
Multi-user Wireless Image Semantic Transmission over MIMO Multiple Access Channels

Bingyan Xie, Yongpeng Wu, Feng Shu et al.

This paper focuses on a typical uplink transmission scenario over a multiple-input multiple-output multiple access channel (MIMO-MAC) and proposes a multi-user learnable CSI fusion semantic communication (MU-LCFSC) framework. It incorporates CSI as side information into both the semantic encoders and decoders to generate a proper feature mask map, which produces a more robust attention weight distribution. At the decoding end in particular, a cooperative successive interference cancellation procedure is conducted along with a cooperative mask ratio generator, which flexibly controls the mask elements of the feature mask maps. Numerical results verify that the proposed MU-LCFSC outperforms DeepJSCC-NOMA by over 3 dB in terms of PSNR.

LG | Feb 11, 2025
Learnable Residual-Based Latent Denoising in Semantic Communication

Mingkai Xu, Yongpeng Wu, Yuxuan Shi et al.

A latent denoising semantic communication (SemCom) framework is proposed for robust image transmission over noisy channels. By incorporating a learnable latent denoiser into the receiver, the received signals are preprocessed to effectively remove the channel noise and recover the semantic information, thereby enhancing the quality of the decoded images. Specifically, a latent denoising mapping is established by an iterative residual learning approach to improve the denoising efficiency while ensuring stable performance. Moreover, the channel signal-to-noise ratio (SNR) is utilized to estimate and predict the latent similarity score (SS) for conditional denoising, where the number of denoising steps is adapted based on the predicted SS sequence, further reducing the communication latency. Finally, simulations demonstrate that the proposed framework can effectively and efficiently remove the channel noise at various levels and reconstruct visually appealing images.

SP | Dec 17, 2025
Large Model Enabled Embodied Intelligence for 6G Integrated Perception, Communication, and Computation Network

Zhuoran Li, Zhen Gao, Xinhua Liu et al.

The advent of sixth-generation (6G) places intelligence at the core of wireless architecture, fusing perception, communication, and computation into a single closed loop. This paper argues that large artificial intelligence models (LAMs) can endow base stations with perception, reasoning, and acting capabilities, thus transforming them into intelligent base station agents (IBSAs). We first review the historical evolution of BSs from single-functional analog infrastructure to distributed, software-defined, and finally LAM-empowered IBSA, highlighting the accompanying changes in architecture, hardware platforms, and deployment. We then present an IBSA architecture that couples a perception-cognition-execution pipeline with cloud-edge-end collaboration and parameter-efficient adaptation. Subsequently, we study two representative scenarios: (i) cooperative vehicle-road perception for autonomous driving, and (ii) ubiquitous base station support for low-altitude uncrewed aerial vehicle safety monitoring and response against unauthorized drones. On this basis, we analyze key enabling technologies spanning LAM design and training, efficient edge-cloud inference, multi-modal perception and actuation, as well as trustworthy security and governance. We further propose a holistic evaluation framework and benchmark considerations that jointly cover communication performance, perception accuracy, decision-making reliability, safety, and energy efficiency. Finally, we distill open challenges on benchmarks, continual adaptation, trustworthy decision-making, and standardization. Together, this work positions LAM-enabled IBSAs as a practical path toward integrated perception, communication, and computation native, safety-critical 6G systems.

IT | Jan 15, 2018
Two High-performance Schemes of Transmit Antenna Selection for Secure Spatial Modulation

Feng Shu, Zhengwang Wang, Riqing Chen et al.

In this paper, an artificial noise (AN)-aided secure spatial modulation (SM) system is investigated. To achieve a higher secrecy rate (SR) in such a system, two high-performance schemes of transmit antenna selection (TAS), leakage-based and maximum secrecy rate (Max-SR), are proposed, and a generalized Euclidean distance-optimized antenna selection (EDAS) method is designed. Simulation results and analysis show that, in terms of SR performance, the four TAS schemes rank in decreasing order as follows: Max-SR, leakage-based, generalized EDAS, and random (conventional). However, the proposed Max-SR method requires an exhaustive search to achieve the optimal SR performance, so its complexity becomes extremely high as the number of antennas grows to medium and large scale. The proposed leakage-based method approaches the Max-SR method with much lower complexity. Thus, it achieves a good balance between complexity and SR performance. In terms of bit error rate (BER), their performances are in an increasing order: random, leakage-based, Max-SR, and generalized EDAS.
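
To see why an exhaustive search becomes costly, assume (as an illustration, since the abstract does not give the exact search space) that antenna selection picks a subset of a fixed size from all available transmit antennas. The number of candidate subsets is then a binomial coefficient. The function name and the antenna counts below are hypothetical.

```python
import math


def num_antenna_subsets(num_available, num_selected):
    """Number of candidate subsets an exhaustive antenna search must score.

    Assumption: the search picks `num_selected` out of `num_available`
    antennas, so the candidate count is C(num_available, num_selected).
    """
    return math.comb(num_available, num_selected)


# Selecting half of the antennas for a few hypothetical array sizes.
for total in (8, 16, 32):
    print(total, num_antenna_subsets(total, total // 2))
```

Under this assumption the count grows from 70 subsets at 8 antennas to 12870 at 16, which is the kind of growth that motivates lower-complexity rules such as the leakage-based scheme.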

LG | Dec 16, 2017
A Machine Learning Framework for Resource Allocation Assisted by Cloud Computing

Jun-Bo Wang, Junyuan Wang, Yongpeng Wu et al.

Conventionally, resource allocation is formulated as an optimization problem and solved online with instantaneous scenario information. Since most resource allocation problems are not convex, the optimal solutions are very difficult to obtain in real time. Lagrangian relaxation or greedy methods are then often employed, which results in performance loss. Therefore, conventional resource allocation methods face great challenges in meeting the ever-increasing QoS requirements of users with scarce radio resources. Assisted by cloud computing, a huge amount of historical data on scenarios can be collected for extracting similarities among scenarios using machine learning. Moreover, optimal or near-optimal solutions of historical scenarios can be searched offline and stored in advance. When the measured data of the current scenario arrives, the current scenario is compared with historical scenarios to find the most similar one. Then, the optimal or near-optimal solution of the most similar historical scenario is adopted to allocate the radio resources for the current scenario. To facilitate the application of this new design philosophy, a machine learning framework is proposed for resource allocation assisted by cloud computing. An example of beam allocation in multi-user massive multiple-input-multiple-output (MIMO) systems shows that the proposed machine learning-based resource allocation outperforms conventional methods.
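
The lookup described above (find the most similar historical scenario, then reuse its stored solution) can be sketched as a nearest-neighbor search. Euclidean distance as the similarity measure, the feature vectors, and the beam labels are all assumptions for illustration; the abstract does not specify them.

```python
import math


def nearest_scenario_solution(current, history):
    """Return the stored solution of the historical scenario closest to `current`.

    `history` is a list of (feature_vector, stored_solution) pairs.
    Assumption: similarity is measured by Euclidean distance between
    feature vectors of equal length.
    """
    _, best_solution = min(history, key=lambda item: math.dist(current, item[0]))
    return best_solution


# Hypothetical offline database: scenario features -> precomputed allocation.
history = [
    ([0.0, 0.0], "beam_A"),
    ([10.0, 10.0], "beam_B"),
]
allocation = nearest_scenario_solution([9.0, 8.0], history)
```

The expensive optimization happens offline when the database is built, so the online step reduces to a distance comparison.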

IT | Dec 6, 2017
Secure Directional Modulation to Enhance Physical Layer Security in IoT Networks

Feng Shu, Siming Wan, Shihao Yan et al.

In this work, an adaptive and robust null-space projection (AR-NSP) scheme is proposed for secure transmission with artificial noise (AN)-aided directional modulation (DM) in wireless networks. The proposed scheme is carried out in three steps. Firstly, the directions of arrival (DOAs) of the signals from the desired user and eavesdropper are estimated by the Root Multiple Signal Classification (Root-MUSIC) algorithm, and the related signal-to-noise ratios (SNRs) are estimated based on the ratio of the corresponding eigenvalue to the minimum eigenvalue of the covariance matrix of the received signals. In the second step, the value intervals of DOA estimation errors are predicted based on the DOA and SNR estimations. Finally, a robust NSP beamforming DM system is designed according to the previously obtained estimations and predictions. Our examination shows that the proposed scheme can significantly outperform the conventional non-adaptive robust scheme and non-robust NSP scheme in terms of achieving a much lower bit error rate (BER) at the desired user and a much higher secrecy rate (SR). In addition, the BER and SR performance gains achieved by the proposed scheme relative to other schemes increase with the value range of DOA estimation error.
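
The basic idea behind null-space projection can be sketched as follows: artificial noise is projected onto the null space of the desired user's steering vector so that it vanishes in that direction. This is the plain, non-robust version with a perfectly known DOA, not the AR-NSP design of the paper. The uniform linear array model, half-wavelength spacing, cosine angle convention, and all names are assumptions.

```python
import cmath
import math


def steering_vector(num_antennas, angle_rad, spacing_wavelengths=0.5):
    """Steering vector of a uniform linear array (assumed array model)."""
    return [
        cmath.exp(2j * math.pi * spacing_wavelengths * n * math.cos(angle_rad))
        for n in range(num_antennas)
    ]


def inner(a, b):
    """Complex inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def project_onto_null_space(vector, direction):
    """Remove from `vector` its component along `direction`."""
    scale = inner(direction, vector) / inner(direction, direction)
    return [v - scale * d for v, d in zip(vector, direction)]


# Hypothetical desired-user direction of 60 degrees and an arbitrary AN vector.
desired = steering_vector(4, math.pi / 3)
raw_noise = [1 + 0j, 0.5j, -1 + 0j, 0.25 + 0j]
projected_noise = project_onto_null_space(raw_noise, desired)
leakage = abs(inner(desired, projected_noise))  # close to zero by construction
```

With an imperfect DOA estimate the true steering vector differs from `desired`, so some noise leaks to the desired user, which is the situation a robust design has to handle.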