Xiang-Gen Xia

IT
h-index: 29
Papers: 26
Citations: 414
Novelty: 49%
AI Score: 51

26 Papers

CV Sep 16, 2022
LO-Det: Lightweight Oriented Object Detection in Remote Sensing Images

Zhanchao Huang, Wei Li, Xiang-Gen Xia et al.

A few lightweight convolutional neural network (CNN) models have been recently designed for remote sensing object detection (RSOD). However, most of them simply replace vanilla convolutions with stacked separable convolutions, which may be inefficient because of substantial precision loss and may not be able to detect oriented bounding boxes (OBBs). Also, existing OBB detection methods struggle to accurately constrain the shape of objects predicted by CNNs. In this paper, we propose an effective lightweight oriented object detector (LO-Det). Specifically, a channel separation-aggregation (CSA) structure is designed to simplify the complexity of stacked separable convolutions, and a dynamic receptive field (DRF) mechanism is developed to maintain high accuracy by customizing the convolution kernel and its perception range dynamically when reducing the network complexity. The CSA-DRF component optimizes efficiency while maintaining high accuracy. Then, a diagonal support constraint head (DSC-Head) component is designed to detect OBBs and constrain their shapes more accurately and stably. Extensive experiments on public datasets demonstrate that the proposed LO-Det can run very fast even on embedded devices with competitive accuracy in detecting oriented objects.
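
The efficiency argument for separable convolutions comes down to a weight count. The sketch below compares a standard convolution with a depthwise-separable one. The channel sizes and helper names are illustrative assumptions, and the code does not reproduce the CSA or DRF structures described in the abstract.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in      # one k x k filter per input channel
    pointwise = c_in * c_out      # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
standard = conv_params(128, 256, 3)
separable = separable_conv_params(128, 256, 3)
ratio = separable / standard      # fraction of weights kept by the separable layer
print(standard, separable, round(ratio, 3))
```

The saving in weights says nothing about accuracy, which is the trade-off the abstract is concerned with.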

CV Sep 6, 2022
Task-wise Sampling Convolutions for Arbitrary-Oriented Object Detection in Aerial Images

Zhanchao Huang, Wei Li, Xiang-Gen Xia et al.

Arbitrary-oriented object detection (AOOD) has been widely applied to locate and classify objects with diverse orientations in remote sensing images. However, the inconsistent features for the localization and classification tasks in AOOD models may lead to ambiguity and low-quality object predictions, which constrains the detection performance. In this article, an AOOD method called task-wise sampling convolutions (TS-Conv) is proposed. TS-Conv adaptively samples task-wise features from respective sensitive regions and maps these features together in alignment to guide a dynamic label assignment for better predictions. Specifically, sampling positions of the localization convolution in TS-Conv are supervised by the oriented bounding box (OBB) prediction associated with spatial coordinates, while sampling positions and convolutional kernel of the classification convolution are designed to be adaptively adjusted according to different orientations for improving the orientation robustness of features. Furthermore, a dynamic task-consistent-aware label assignment (DTLA) strategy is developed to select optimal candidate positions and assign labels dynamically according to ranked task-aware scores obtained from TS-Conv. Extensive experiments on several public datasets covering multiple scenes, multimodal images, and multiple categories of objects demonstrate the effectiveness, scalability, and superior performance of the proposed TS-Conv.

ME Nov 3, 2015
A Coherent Integration Method Based on Radon Non-uniform FRFT for Random Pulse Repetition Interval (RPRI) Radar

Jing Tian, Xiang-Gen Xia, Gang Yang et al.

To solve the range cell migration (RCM) and spectrum spread during the integration time induced by the motion of a target, this paper proposes a new coherent integration method based on Radon non-uniform FRFT (NUFRFT) for random pulse repetition interval (RPRI) radar. In this method, RCM is eliminated via searching in the motion parameter space and the spectrum spread is resolved by using NUFRFT. Comparisons with other popular methods, namely moving target detection (MTD), Radon-Fourier transform (RFT), and Radon-Fractional Fourier Transform (RFRFT), are performed. The simulation results demonstrate that the proposed method can detect the moving target even in low-SNR scenarios and is superior to the compared methods.
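
To see why non-uniform transforms matter for RPRI radar, consider the simplest case: a constant Doppler shift sampled at random pulse times. The sketch below is a plain non-uniform DFT over a Doppler grid, not the Radon-NUFRFT of the paper (it performs no range-migration search and no chirp-rate search). The pulse count, jitter, and Doppler value are assumptions chosen for illustration.

```python
import cmath
import random


def coherent_integrate(samples, times, doppler_hz):
    """Non-uniform DFT: remove the phase of a candidate Doppler, then sum."""
    return sum(x * cmath.exp(-2j * cmath.pi * doppler_hz * t)
               for x, t in zip(samples, times))


# Hypothetical setup: 64 pulses, 1 ms mean PRI with +/-30% random jitter.
rng = random.Random(0)
times, t = [], 0.0
for _ in range(64):
    times.append(t)
    t += 1e-3 * (1.0 + rng.uniform(-0.3, 0.3))

true_doppler = 120.0  # Hz, assumed constant-velocity target, noise-free echo
echo = [cmath.exp(2j * cmath.pi * true_doppler * ti) for ti in times]

# Search a coarse Doppler grid and keep the candidate with the largest magnitude.
candidates = [float(f) for f in range(-200, 201, 10)]
spectrum = {f: abs(coherent_integrate(echo, times, f)) for f in candidates}
best = max(spectrum, key=spectrum.get)
print(best, spectrum[best])
```

At the matching Doppler all 64 pulses add in phase, so the magnitude reaches the pulse count; every other grid point sums misaligned phases and stays below it.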

80.4 IT May 26
Joint Localization and Orientation with Triple-Beam Fingerprints in Massive MIMO-OFDM

Yu Zhao, Zhenzhou Jin, Jinke Tang et al.

With the widespread application of location-based services, fingerprint-based localization has demonstrated advantages in environments with complex signal propagation. Deep learning has significantly improved the efficiency of both offline training and online matching in localization processes. However, existing fingerprints only contain terminal position information without capturing motion states, and neural network designs have not fully incorporated structural features such as fingerprint sparsity. In this paper, we propose a triple-beam fingerprint (TBF) incorporating Doppler information and design a Transformer-based localization and orientation awareness network (LOA-Net) to simultaneously estimate user position and motion direction in massive multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems. We first show the correlation between TBF and multipath information, and investigate the collinearity of different TBFs, demonstrating that TBF is an effective small-size sparse fingerprint. Then, we propose LOA-Net containing a mask-augmented detection Transformer for regression (MaskDETR-Reg) module and a fusion-enhanced Transformer for direction classification (Fusion-TDC) module to process angle-delay domain information and Doppler domain information, respectively. Finally, in the simulation of indoor scenarios defined in 3GPP 38.901, the proposed method achieves significantly better localization accuracy than weighted $K$-nearest neighbors (WKNN), 2D and 3D convolutional neural networks (CNNs), and achieves satisfactory motion direction estimation accuracy.
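
The weighted K-nearest neighbors (WKNN) baseline named in the abstract can be sketched in a few lines. This is a generic inverse-distance WKNN, assuming Euclidean distance between fingerprint vectors; the toy fingerprints, positions, and function name are hypothetical and are not taken from the paper.

```python
import math


def wknn_locate(query, fingerprints, positions, k=3, eps=1e-9):
    """Estimate a position as the inverse-distance-weighted mean of the k nearest fingerprints."""
    dists = [math.dist(query, f) for f in fingerprints]
    nearest = sorted(range(len(dists)), key=dists.__getitem__)[:k]
    weights = [1.0 / (dists[i] + eps) for i in nearest]  # eps avoids division by zero
    total = sum(weights)
    dim = len(positions[0])
    return tuple(
        sum(w * positions[i][d] for w, i in zip(weights, nearest)) / total
        for d in range(dim)
    )


# Toy offline database: fingerprint vector -> known position.
fingerprints = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)]
positions = [(0.0, 0.0), (4.0, 0.0), (9.0, 9.0)]

# The query is equidistant from the first two fingerprints, so with k=2
# the estimate is the midpoint of their positions.
estimate = wknn_locate((1.0, 0.0), fingerprints, positions, k=2)
print(estimate)
```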

CV Dec 4, 2025
UniTS: Unified Time Series Generative Model for Remote Sensing

Yuxiang Zhang, Shunlin Liang, Wenyuan Li et al.

One of the primary objectives of satellite remote sensing is to capture the complex dynamics of the Earth environment, which encompasses tasks such as reconstructing continuous cloud-free time series images, detecting land cover changes, and forecasting future surface evolution. However, existing methods typically require specialized models tailored to different tasks, lacking unified modeling of spatiotemporal features across multiple time series tasks. In this paper, we propose a Unified Time Series Generative Model (UniTS), a general framework applicable to various time series tasks, including time series reconstruction, time series cloud removal, time series semantic change detection, and time series forecasting. Based on the flow matching generative paradigm, UniTS constructs a deterministic evolution path from noise to targets under the guidance of task-specific conditions, achieving unified modeling of spatiotemporal representations for multiple tasks. The UniTS architecture consists of a diffusion transformer with spatio-temporal blocks, where we design an Adaptive Condition Injector (ACor) to enhance the model's conditional perception of multimodal inputs, enabling high-quality controllable generation. Additionally, we design a Spatiotemporal-aware Modulator (STM) to improve the ability of spatio-temporal blocks to capture complex spatiotemporal dependencies. Furthermore, we construct two high-quality multimodal time series datasets, TS-S12 and TS-S12CR, filling the gap of benchmark datasets for time series cloud removal and forecasting tasks. Extensive experiments demonstrate that UniTS exhibits exceptional generative and cognitive capabilities in both low-level and high-level time series tasks. It significantly outperforms existing methods, particularly when facing challenges such as severe cloud contamination, modality absence, and forecasting phenological variations.
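
The "deterministic evolution path from noise to targets" can be illustrated with the simplest flow matching setup. The sketch assumes a straight-line path between a noise sample and a target sample, which is one common choice and is not confirmed by the abstract. The oracle velocity function stands in for the trained network, and all names are hypothetical.

```python
def interpolate(noise, target, t):
    """Point on the straight path between noise (t=0) and target (t=1)."""
    return [(1.0 - t) * n + t * x for n, x in zip(noise, target)]


def target_velocity(noise, target):
    """Regression target for the velocity network on a straight path."""
    return [x - n for n, x in zip(noise, target)]


def euler_sample(noise, velocity_fn, steps=10):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        v = velocity_fn(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


noise = [0.5, -1.0, 2.0]
target = [1.0, 0.0, -1.0]

# Oracle velocity: what a perfectly trained network would output for this pair.
oracle = lambda x, t: target_velocity(noise, target)
generated = euler_sample(noise, oracle, steps=10)
print(generated)
```

In a conditional model the velocity function would also receive the task-specific condition; that input is omitted here.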

CV Aug 22, 2024
BihoT: A Large-Scale Dataset and Benchmark for Hyperspectral Camouflaged Object Tracking

Hanzheng Wang, Wei Li, Xiang-Gen Xia et al.

Hyperspectral object tracking (HOT) has exhibited potential in various applications, particularly in scenes where objects are camouflaged. Existing trackers can effectively retrieve objects via band regrouping because of the bias in existing HOT datasets, where most objects tend to have distinguishing visual appearances rather than spectral characteristics. This bias allows the tracker to directly use the visual features obtained from the false-color images generated by hyperspectral images without the need to extract spectral features. To tackle this bias, we find that the tracker should focus on the spectral information when object appearance is unreliable. Thus, we provide a new task called hyperspectral camouflaged object tracking (HCOT) and meticulously construct a large-scale HCOT dataset, termed BihoT, which consists of 41,912 hyperspectral images covering 49 video sequences. The dataset covers various artificial camouflage scenes where objects have similar appearances, diverse spectrums, and frequent occlusion, making it a very challenging dataset for HCOT. Besides, a simple but effective baseline model, named spectral prompt-based distractor-aware network (SPDAN), is proposed, comprising a spectral embedding network (SEN), a spectral prompt-based backbone network (SPBN), and a distractor-aware module (DAM). Specifically, the SEN extracts spectral-spatial features via 3-D and 2-D convolutions. Then, the SPBN fine-tunes powerful RGB trackers with spectral prompts and alleviates the insufficiency of training samples. Moreover, the DAM utilizes a novel statistic to capture the distractor caused by occlusion from objects and background. Extensive experiments demonstrate that our proposed SPDAN achieves state-of-the-art performance on the proposed BihoT and other HOT datasets.

13.4 LG Apr 30
Statistical Channel Fingerprint Construction for Massive MIMO: A Unified Tensor Learning Framework

Zhenzhou Jin, Li You, Xiang-Gen Xia et al.

Channel fingerprint (CF) is considered a key enabler for facilitating the acquisition of channel state information (CSI) in massive multiple-input multiple-output (MIMO) communication systems. In this work, we investigate a novel type of CF that stores statistical CSI (sCSI) at each potential location, referred to as statistical CF (sCF). Specifically, we reveal the relationship between sCSI, namely the channel spatial covariance matrix (CSCM), and the channel power angular spectrum (CPAS). Building on this foundation, we construct a unified tensor representation of the sCF and further reduce its dimension by exploiting the eigenvalue decomposition of the CSCM and its correlation with the CPAS. Considering the practical constraints imposed by measurement cost, privacy, and security, we focus on three representative scenarios and uniformly formulate them as tensor restoration tasks. To this end, we propose a unified tensor-based learning architecture, termed LPWTNet. The architecture incorporates a closed-form Laplacian pyramid (LP) decomposition and reconstruction framework that replaces the traditional encoder-decoder structure, enabling efficient inference while capturing multi-scale frequency subband characteristics of the sCF. Additionally, a shared mask learning strategy is introduced to adaptively refine high-frequency sCF components through level-wise adjustments. To achieve a larger receptive field without over-parameterization, we further propose a small-kernel convolution mechanism based on the wavelet transform (WT), which decouples convolution across different frequency components of the sCF and enhances feature extraction efficiency. Extensive experiments show that the proposed approach delivers competitive reconstruction accuracy and computational efficiency across various sCF construction scenarios when compared with state-of-the-art baselines.

IT Dec 24, 2024
GDM4MMIMO: Generative Diffusion Models for Massive MIMO Communications

Zhenzhou Jin, Li You, Huibin Zhou et al.

Massive multiple-input multiple-output (MIMO) offers significant advantages in spectral and energy efficiencies, positioning it as a cornerstone technology of fifth-generation (5G) wireless communication systems and a promising solution for the burgeoning data demands anticipated in sixth-generation (6G) networks. In recent years, with the continuous advancement of artificial intelligence (AI), a multitude of task-oriented generative foundation models (GFMs) have emerged, achieving remarkable performance in various fields such as computer vision (CV), natural language processing (NLP), and autonomous driving. As a pioneering force, these models are driving the paradigm shift in AI towards generative AI (GenAI). Among them, the generative diffusion model (GDM), as one of the state-of-the-art families of generative models, demonstrates an exceptional capability to learn implicit prior knowledge and robust generalization capabilities, thereby enhancing its versatility and effectiveness across diverse applications. In this paper, we delve into the potential applications of GDM in massive MIMO communications. Specifically, we first provide an overview of massive MIMO communication, the framework of GFMs, and the working mechanism of GDM. Following this, we discuss recent research advancements in the field and present a case study of near-field channel estimation based on GDM, demonstrating its promising potential for facilitating efficient ultra-dimensional channel state information (CSI) acquisition in the context of massive MIMO communications. Finally, we highlight several pressing challenges in future mobile communications and identify promising research directions surrounding GDM.

SP May 11, 2025
Near-Field Channel Estimation for XL-MIMO: A Deep Generative Model Guided by Side Information

Zhenzhou Jin, Li You, Derrick Wing Kwan Ng et al.

This paper investigates the near-field (NF) channel estimation (CE) for extremely large-scale multiple-input multiple-output (XL-MIMO) systems. Considering the pronounced NF effects in XL-MIMO communications, we first establish a joint angle-distance (AD) domain-based spherical-wavefront physical channel model that captures the inherent sparsity of XL-MIMO channels. Leveraging the channel's sparsity in the joint AD domain, the CE is approached as a task of reconstructing sparse signals. Anchored in this framework, we first propose a compressed sensing algorithm to acquire a preliminary channel estimate. Harnessing the powerful implicit prior learning capability of generative artificial intelligence (GenAI), we further propose a GenAI-based approach to refine the estimated channel. Specifically, we introduce the preliminary estimated channel as side information, and derive the evidence lower bound (ELBO) of the log-marginal distribution of the target NF channel conditioned on the preliminary estimated channel, which serves as the optimization objective for the proposed generative diffusion model (GDM). Additionally, we introduce a more generalized version of the GDM, the non-Markovian GDM (NM-GDM), to accelerate the sampling process, achieving an approximately tenfold enhancement in sampling efficiency. Experimental results indicate that the proposed approach is capable of offering substantial performance gain in CE compared to existing benchmark schemes within NF XL-MIMO systems. Furthermore, our approach exhibits enhanced generalization capabilities in both the NF and far-field (FF) regions.

IT Dec 30, 2024
CF-CGN: Channel Fingerprints Extrapolation for Multi-band Massive MIMO Transmission based on Cycle-Consistent Generative Networks

Chenjie Xie, Li You, Zhenzhou Jin et al.

Multi-band massive multiple-input multiple-output (MIMO) communication can promote the cooperation of licensed and unlicensed spectra, effectively enhancing spectrum efficiency for Wi-Fi and other wireless systems. As an enabler for multi-band transmission, channel fingerprints (CF), also known as the channel knowledge map or radio environment map, are used to assist channel state information (CSI) acquisition and reduce computational complexity. In this paper, we propose CF-CGN (Channel Fingerprints with Cycle-consistent Generative Networks) to extrapolate CF for multi-band massive MIMO transmission where licensed and unlicensed spectra cooperate to provide ubiquitous connectivity. Specifically, we first model CF as a multichannel image and transform the extrapolation problem into an image translation task, which converts CF from one frequency to another by exploring the shared characteristics of statistical CSI in the beam domain. Then, paired generative networks are designed and coupled by variable-weight cycle consistency losses to fit the reciprocal relationship at different bands. Matched with the coupled networks, a joint training strategy is developed accordingly, supporting synchronous optimization of all trainable parameters. During the inference process, we also introduce a refining scheme to improve the extrapolation accuracy based on the resolution of CF. Numerical results illustrate that our proposed CF-CGN can achieve bidirectional extrapolation with an error 5-17 dB lower than the benchmarks in different communication scenarios, demonstrating its excellent generalization ability. We further show that the sum rate performance assisted by CF-CGN-based CF is close to that with perfect CSI for multi-band massive MIMO transmission.
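
A cycle consistency loss penalizes a pair of mappings for failing to invert each other. The sketch below uses an L1 distance and two scalar weights as a stand-in for the "variable-weight" losses. The toy mappings replace the generative networks, and every name and value is an assumption for illustration.

```python
def l1(a, b):
    """Mean absolute difference between two equally long sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_loss(cf_a, cf_b, g_ab, g_ba, w_a=1.0, w_b=1.0):
    """Weighted sum of the A->B->A and B->A->B reconstruction errors."""
    loss_a = l1(g_ba(g_ab(cf_a)), cf_a)   # band A -> band B -> band A
    loss_b = l1(g_ab(g_ba(cf_b)), cf_b)   # band B -> band A -> band B
    return w_a * loss_a + w_b * loss_b


# Toy "generators": an affine map and its exact inverse.
g_ab = lambda cf: [2.0 * v + 1.0 for v in cf]
g_ba = lambda cf: [(v - 1.0) / 2.0 for v in cf]
# A mismatched backward map that drops the offset.
biased_g_ba = lambda cf: [v / 2.0 for v in cf]

cf_a = [0.0, 1.0, 2.0]   # hypothetical flattened CF at band A
cf_b = [1.0, 3.0, 5.0]   # hypothetical flattened CF at band B

consistent = cycle_loss(cf_a, cf_b, g_ab, g_ba)
inconsistent = cycle_loss(cf_a, cf_b, g_ab, biased_g_ba, w_a=0.5, w_b=2.0)
print(consistent, inconsistent)
```

The loss vanishes only when the two mappings are mutual inverses on the data, which is the reciprocal relationship the coupled networks are trained to fit.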

CV Mar 9, 2024
SSF-Net: Spatial-Spectral Fusion Network with Spectral Angle Awareness for Hyperspectral Object Tracking

Hanzheng Wang, Wei Li, Xiang-Gen Xia et al.

Hyperspectral video (HSV) offers valuable spatial, spectral, and temporal information simultaneously, making it highly suitable for handling challenges such as background clutter and visual similarity in object tracking. However, existing methods primarily focus on band regrouping and rely on RGB trackers for feature extraction, resulting in limited exploration of spectral information and difficulties in achieving complementary representations of object features. In this paper, a spatial-spectral fusion network with spectral angle awareness (SSF-Net) is proposed for hyperspectral (HS) object tracking. Firstly, to address the issue of insufficient spectral feature extraction in existing networks, a spatial-spectral feature backbone ($S^2$FB) is designed. With the spatial and spectral extraction branches, a joint representation of texture and spectrum is obtained. Secondly, a spectral attention fusion module (SAFM) is presented to capture the intra- and inter-modality correlation to obtain the fused features from the HS and RGB modalities. It can incorporate the visual information into the HS spectral context to form a robust representation. Thirdly, to ensure a more accurate response of the tracker to the object position, a spectral angle awareness module (SAAM) investigates the region-level spectral similarity between the template and search images during the prediction stage. Furthermore, we develop a novel spectral angle awareness loss (SAAL) to offer guidance for the SAAM based on similar regions. Finally, to obtain robust tracking results, a weighted prediction method is considered to combine the HS and RGB predicted motions of objects to leverage the strengths of each modality. Extensive experiments on the HOTC dataset demonstrate the effectiveness of the proposed SSF-Net, compared with state-of-the-art trackers.
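
Spectral similarity of the kind the SAAM relies on is commonly measured with the spectral angle between two spectra. The sketch below is the standard per-pixel spectral angle, not the region-level module or the SAAL loss from the paper; the example spectra are made up.

```python
import math


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors (0 means same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))  # clamp rounding error
    return math.acos(cosine)


template = [0.10, 0.20, 0.30]
brighter = [0.20, 0.40, 0.60]     # same material under stronger illumination
different = [0.30, 0.20, 0.10]    # different spectral shape

print(spectral_angle(template, brighter), spectral_angle(template, different))
```

Because the angle ignores vector length, a uniformly brighter copy of a spectrum scores as nearly identical, which is why the measure is useful when appearance alone is unreliable.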

LG Feb 11, 2025
Learnable Residual-Based Latent Denoising in Semantic Communication

Mingkai Xu, Yongpeng Wu, Yuxuan Shi et al.

A latent denoising semantic communication (SemCom) framework is proposed for robust image transmission over noisy channels. By incorporating a learnable latent denoiser into the receiver, the received signals are preprocessed to effectively remove the channel noise and recover the semantic information, thereby enhancing the quality of the decoded images. Specifically, a latent denoising mapping is established by an iterative residual learning approach to improve the denoising efficiency while ensuring stable performance. Moreover, channel signal-to-noise ratio (SNR) is utilized to estimate and predict the latent similarity score (SS) for conditional denoising, where the number of denoising steps is adapted based on the predicted SS sequence, further reducing the communication latency. Finally, simulations demonstrate that the proposed framework can effectively and efficiently remove the channel noise at various levels and reconstruct visually appealing images.

IV Sep 5, 2025
Multi-modal Uncertainty Robust Tree Cover Segmentation For High-Resolution Remote Sensing Images

Yuanyuan Gui, Wei Li, Yinjian Wang et al.

Recent advances in semantic segmentation of multi-modal remote sensing images have significantly improved the accuracy of tree cover mapping, supporting applications in urban planning, forest monitoring, and ecological assessment. Integrating data from multiple modalities, such as optical imagery, light detection and ranging (LiDAR), and synthetic aperture radar (SAR), has shown superior performance over single-modality methods. However, these data are often acquired days or even months apart, during which various changes may occur, such as vegetation disturbances (e.g., logging and wildfires) and variations in imaging quality. Such temporal misalignments introduce cross-modal uncertainty, especially in high-resolution imagery, which can severely degrade segmentation accuracy. To address this challenge, we propose MURTreeFormer, a novel multi-modal segmentation framework that mitigates and leverages aleatoric uncertainty for robust tree cover mapping. MURTreeFormer treats one modality as primary and others as auxiliary, explicitly modeling patch-level uncertainty in the auxiliary modalities via a probabilistic latent representation. Uncertain patches are identified and reconstructed from the primary modality's distribution through a VAE-based resampling mechanism, producing enhanced auxiliary features for fusion. In the decoder, a gradient magnitude attention (GMA) module and a lightweight refinement head (RH) are further integrated to guide attention toward tree-like structures and to preserve fine-grained spatial details. Extensive experiments on multi-modal datasets from Shanghai and Zurich demonstrate that MURTreeFormer significantly improves segmentation performance and effectively reduces the impact of temporally induced aleatoric uncertainty.

NI May 12, 2025
Channel Fingerprint Construction for Massive MIMO: A Deep Conditional Generative Approach

Zhenzhou Jin, Li You, Xudong Li et al.

Accurate channel state information (CSI) acquisition for massive multiple-input multiple-output (MIMO) systems is essential for future mobile communication networks. Channel fingerprint (CF), also referred to as channel knowledge map, is a key enabler for intelligent environment-aware communication and can facilitate CSI acquisition. However, due to the cost limitations of practical sensing nodes and test vehicles, the resulting CF is typically coarse-grained, making it insufficient for wireless transceiver design. In this work, we introduce the concept of CF twins and design a conditional generative diffusion model (CGDM) with strong implicit prior learning capabilities as the computational core of the CF twin to establish the connection between coarse- and fine-grained CFs. Specifically, we employ a variational inference technique to derive the evidence lower bound (ELBO) for the log-marginal distribution of the observed fine-grained CF conditioned on the coarse-grained CF, enabling the CGDM to learn the complicated distribution of the target data. During the denoising neural network optimization, the coarse-grained CF is introduced as side information to accurately guide the conditioned generation of the CGDM. To make the proposed CGDM lightweight, we further leverage the additivity of network layers and introduce a one-shot pruning approach along with a multi-objective knowledge distillation technique. Experimental results show that the proposed approach exhibits significant improvement in reconstruction performance compared to the baselines. Additionally, zero-shot testing on reconstruction tasks with different magnification factors further demonstrates the scalability and generalization ability of the proposed approach.

NI May 12, 2025
EnvCDiff: Joint Refinement of Environmental Information and Channel Fingerprints via Conditional Generative Diffusion Model

Zhenzhou Jin, Li You, Xiang-Gen Xia et al.

The paradigm shift from environment-unaware communication to intelligent environment-aware communication is expected to facilitate the acquisition of channel state information for future wireless communications. Channel Fingerprint (CF), as an emerging enabling technology for environment-aware communication, provides channel-related knowledge for potential locations within the target communication area. However, due to the limited availability of practical devices for sensing environmental information and measuring channel-related knowledge, most of the acquired environmental information and CF are coarse-grained, insufficient to guide the design of wireless transmissions. To address this, this paper proposes a deep conditional generative learning approach, namely a customized conditional generative diffusion model (CDiff). The proposed CDiff simultaneously refines environmental information and CF, reconstructing a fine-grained CF that incorporates environmental information, referred to as EnvCF, from its coarse-grained counterpart. Experimental results show that the proposed approach significantly improves the performance of EnvCF construction compared to the baselines.

IT Jun 14, 2024
An I2I Inpainting Approach for Efficient Channel Knowledge Map Construction

Zhenzhou Jin, Li You, Jue Wang et al.

Channel knowledge map (CKM) has received widespread attention as an emerging enabling technology for environment-aware wireless communications. It involves the construction of databases containing location-specific channel knowledge, which are then leveraged to facilitate channel state information (CSI) acquisition and transceiver design. In this context, a fundamental challenge lies in efficiently constructing the CKM based on a given wireless propagation environment. Most existing methods are based on stochastic modeling and sequence prediction, which do not fully exploit the inherent physical characteristics of the propagation environment, resulting in low accuracy and high computational complexity. To address these limitations, we propose a Laplacian pyramid (LP)-based CKM construction scheme to predict the channel knowledge at arbitrary locations in a targeted area. Specifically, we first view the channel knowledge as a 2-D image and transform the CKM construction problem into an image-to-image (I2I) inpainting task, which predicts the channel knowledge at a specific location by recovering the corresponding pixel value in the image matrix. Then, inspired by the reversible and closed-form structure of the LP, we show its natural suitability for our task in designing a fast I2I mapping network. For different frequency components of LP decomposition, we design tailored networks accordingly. Besides, to encode the global structural information of the propagation environment, we introduce self-attention and cross-covariance attention mechanisms in different layers, respectively. Finally, experimental results show that the proposed scheme outperforms the benchmark, achieving higher reconstruction accuracy with lower computational complexity. Moreover, the proposed approach has a strong generalization ability and can be implemented in different wireless communication scenarios.
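
The "reversible and closed-form structure" of the Laplacian pyramid is easiest to see on a 1-D signal. The sketch below is a simplification under stated assumptions: it uses pairwise averaging and sample repetition where a conventional LP would use Gaussian filtering, and it works on a list instead of a 2-D channel knowledge image. The function names are hypothetical.

```python
def downsample(x):
    """Halve the length by averaging neighbouring pairs (stand-in for blur + decimate)."""
    if len(x) % 2:
        raise ValueError("length must be even at every pyramid level")
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def upsample(x):
    """Double the length by repeating every sample."""
    return [v for v in x for _ in range(2)]


def lp_decompose(x, levels):
    """Return the coarsest low-pass band and the per-level high-frequency residuals."""
    residuals, current = [], list(x)
    for _ in range(levels):
        low = downsample(current)
        residuals.append([c - u for c, u in zip(current, upsample(low))])
        current = low
    return current, residuals


def lp_reconstruct(low, residuals):
    """Invert lp_decompose by adding the residuals back, coarsest level first."""
    current = list(low)
    for res in reversed(residuals):
        current = [u + r for u, r in zip(upsample(current), res)]
    return current


signal = [4.0, 8.0, 6.0, 2.0, 1.0, 3.0, 5.0, 7.0]
low, residuals = lp_decompose(signal, levels=2)
rebuilt = lp_reconstruct(low, residuals)
print(low, rebuilt)
```

Reconstruction needs no learned decoder, since each level is recovered by one upsample and one addition. A network built on this structure only has to predict the low-pass band and the residuals.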

IV Jan 7, 2022
A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction

Wei Wang, Xiang-Gen Xia, Chuanjiang He et al.

In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method only needs to calculate these parameters once for all the pitches and so has lower GPU-memory burdens and is very suitable for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for the high pitch helical CT reconstruction with sparse detectors. Since our network utilizes the features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods both in subjective and objective evaluations.

CV Sep 27, 2021
A General Gaussian Heatmap Label Assignment for Arbitrary-Oriented Object Detection

Zhanchao Huang, Wei Li, Xiang-Gen Xia et al.

Recently, many arbitrary-oriented object detection (AOOD) methods have been proposed and attracted widespread attention in many fields. However, most of them are based on anchor boxes or standard Gaussian heatmaps. Such label assignment strategies may not only fail to reflect the shape and direction characteristics of arbitrary-oriented objects, but also require considerable parameter tuning. In this paper, a novel AOOD method called General Gaussian Heatmap Label Assignment (GGHL) is proposed. Specifically, an anchor-free object-adaptation label assignment (OLA) strategy is presented to define the positive candidates based on two-dimensional (2-D) oriented Gaussian heatmaps, which reflect the shape and direction features of arbitrary-oriented objects. Based on OLA, an oriented-bounding-box (OBB) representation component (ORC) is developed to indicate OBBs and adjust the Gaussian center prior weights to fit the characteristics of different objects adaptively through neural network learning. Moreover, a joint-optimization loss (JOL) with area normalization and dynamic confidence weighting is designed to refine the misaligned optimal results of different subtasks. Extensive experiments on public datasets demonstrate that the proposed GGHL improves the AOOD performance with low parameter-tuning and time costs. Furthermore, it is generally applicable to most AOOD methods to improve their performance, including lightweight models on embedded platforms.
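
An oriented 2-D Gaussian heatmap follows the box's shape and rotation, which an axis-aligned or circular Gaussian cannot do. The sketch below evaluates such a heatmap at a single point. The mapping from box size to standard deviation (`scale`) is an assumption, and the code is not the paper's OLA strategy.

```python
import math


def oriented_gaussian(x, y, cx, cy, w, h, theta, scale=4.0):
    """Heatmap value at (x, y) for a box centred at (cx, cy) with size w x h rotated by theta."""
    dx, dy = x - cx, y - cy
    # Rotate the offset into the box frame: u runs along the width, v along the height.
    u = dx * math.cos(theta) + dy * math.sin(theta)
    v = -dx * math.sin(theta) + dy * math.cos(theta)
    sigma_u, sigma_v = w / scale, h / scale   # assumed size-to-sigma mapping
    return math.exp(-0.5 * ((u / sigma_u) ** 2 + (v / sigma_v) ** 2))


# Hypothetical elongated box (8 x 2) rotated by 45 degrees around the origin.
theta = math.pi / 4
step = math.sqrt(2.0)  # moving (step, step) is a distance of 2 along the long axis

center = oriented_gaussian(0.0, 0.0, 0.0, 0.0, 8.0, 2.0, theta)
along = oriented_gaussian(step, step, 0.0, 0.0, 8.0, 2.0, theta)
across = oriented_gaussian(-step, step, 0.0, 0.0, 8.0, 2.0, theta)
print(center, along, across)
```

Two points at the same distance from the center get very different scores depending on whether they lie along or across the box, which is the shape and direction information the abstract refers to.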

IV Jan 6, 2021
A New Weighting Scheme for Fan-beam and Circle Cone-beam CT Reconstructions

Wei Wang, Xiang-Gen Xia, Chuanjiang He et al.

In this paper, we first present an arc based algorithm for fan-beam computed tomography (CT) reconstruction via applying Katsevich's helical CT formula to 2D fan-beam CT reconstruction. Then, we propose a new weighting function to deal with the redundant projection data. By extending the weighted arc based fan-beam algorithm to circle cone-beam geometry, we also obtain a new FDK-similar algorithm for circle cone-beam CT reconstruction. Experiments show that our methods can obtain higher PSNR and SSIM compared to the Parker-weighted conventional fan-beam algorithm and the FDK algorithm for super-short-scan trajectories.

IV Aug 10, 2020
A model-guided deep network for limited-angle computed tomography

Wei Wang, Xiang-Gen Xia, Chuanjiang He et al.

In this paper, we first propose a variational model for the limited-angle computed tomography (CT) image reconstruction and then convert the model into an end-to-end deep network. We use the penalty method to solve the model and divide it into three iterative subproblems, where the first subproblem completes the sinograms by utilizing the prior information of sinograms in the frequency domain, the second refines the CT images by using the prior information of CT images in the spatial domain, and the last merges the outputs of the first two subproblems. In each iteration, we use convolutional neural networks (CNNs) to approximate the solutions of the first two subproblems and, thus, obtain an end-to-end deep network for the limited-angle CT image reconstruction. Our network tackles both the sinograms and the CT images, and can simultaneously suppress the artifacts caused by the incomplete data and recover fine structural information in the CT images. Experimental results show that our method outperforms the existing algorithms for the limited-angle CT image reconstruction.

ITJan 31, 2020
Exact and Robust Reconstructions of Integer Vectors Based on Multidimensional Chinese Remainder Theorem (MD-CRT)

Li Xiao, Xiang-Gen Xia, Yu-Ping Wang

The robust Chinese remainder theorem (CRT) has been recently proposed for robustly reconstructing a large nonnegative integer from erroneous remainders. It has found many applications in signal processing, including phase unwrapping and frequency estimation under sub-Nyquist sampling. Motivated by the applications in multidimensional (MD) signal processing, in this paper we propose the MD-CRT and robust MD-CRT for integer vectors. Specifically, by rephrasing the abstract CRT for rings in number-theoretic terms, we first derive the MD-CRT for integer vectors with respect to a general set of integer matrix moduli, which provides an algorithm to uniquely reconstruct an integer vector from its remainders, if it is in the fundamental parallelepiped of the lattice generated by a least common right multiple of all the moduli. For some special forms of moduli, we present explicit reconstruction formulae. Moreover, we derive the robust MD-CRT for integer vectors when the remaining integer matrices of all the moduli left divided by their greatest common left divisor (gcld) are pairwise commutative and coprime. Two different reconstruction algorithms are proposed, and accordingly, two different conditions on the remainder error bound for the reconstruction robustness are obtained, which are related to a quarter of the minimum distance of the lattice generated by the gcld of all the moduli or the Smith normal form of the gcld.

IVJan 20, 2020
A deep network for sinogram and CT image reconstruction

Wei Wang, Xiang-Gen Xia, Chuanjiang He et al.

A CT image can be well reconstructed when the sampling rate of the sinogram satisfies the Nyquist criterion and the sampled signal is noise-free. However, in practice, the sinogram is usually contaminated by noise, which degrades the quality of a reconstructed CT image. In this paper, we design a deep network for sinogram and CT image reconstruction. The network consists of two cascaded blocks that are linked by a filtered backprojection (FBP) layer, where the former block is responsible for denoising and completing the sinograms while the latter is used to remove the noise and artifacts of the CT images. Experimental results show that the CT images reconstructed by our methods have the highest PSNR and SSIM on average compared to state-of-the-art methods.

CVAug 25, 2017
A wavelet frame coefficient total variational model for image restoration

Wei Wang, Xiang-Gen Xia, Shengli Zhang et al.

In this paper, we propose a vector total variation (VTV) of feature image model for image restoration. The VTV imposes different smoothing powers on different features (e.g. edges and cartoons) based on choosing various regularization parameters. Thus, the model can simultaneously preserve edges and remove noise. Next, the existence of a solution for the model is proved and the split Bregman algorithm is used to solve the model. Finally, we use the wavelet filter banks to explicitly define the feature operator and present some experimental results to show its advantage over the related methods in both quality and efficiency.

ITOct 1, 2016
Radial Velocity Retrieval for Multichannel SAR Moving Targets with Time-Space Doppler De-ambiguity

Jia Xu, Zu-Zhen Huang, Zhi-Rui Wang et al.

In this paper, with respect to multichannel synthetic aperture radars (SAR), we first formulate the problems of Doppler ambiguities on the radial velocity (RV) estimation of a ground moving target in the range-compressed domain, range-Doppler domain and image domain, respectively. It is revealed that in these problems, a cascaded time-space Doppler ambiguity (CTSDA) may be encountered, i.e., time domain Doppler ambiguity (TDDA) in each channel arises first, followed by spatial domain Doppler ambiguity (SDDA) among the channels. Accordingly, the multichannel SAR systems with different parameters are investigated in three different cases with diverse Doppler ambiguity properties, and a multi-frequency SAR is then proposed to obtain the RV estimation by solving the ambiguity problem based on the Chinese remainder theorem (CRT). In the first two cases, the ambiguity problem can be solved by the existing closed-form robust CRT. In the third case, it is found that the problem is different from the conventional CRT problems and we call it a double remaindering problem in this paper. We then propose a sufficient condition under which the double remaindering problem, i.e., the CTSDA, can also be solved by the closed-form robust CRT. When the sufficient condition is not satisfied for a multichannel SAR, a search-based method is proposed. Finally, some results of numerical experiments are provided to demonstrate the effectiveness of the proposed methods.

ITNov 17, 2015
Artificial-Noise-Aided Message Authentication Codes with Information-Theoretic Security

Xiaofu Wu, Zhen Yang, Cong Ling et al.

In the past, two main approaches for the purpose of authentication, namely information-theoretic authentication codes and complexity-theoretic message authentication codes (MACs), were developed almost independently. In this paper, we propose a new cryptographic primitive, namely, artificial-noise-aided MACs (ANA-MACs), which can be considered as both computationally secure and information-theoretically secure. For ANA-MACs, we introduce artificial noise to interfere with the complexity-theoretic MACs, and quantization is further employed to facilitate packet-based transmission. With a channel coding formulation of key recovery in the MACs, the generation of standard authentication tags can be seen as an encoding process for the ensemble of codes, where the shared key between Alice and Bob is considered as the input and the message is used to specify a code from the ensemble of codes. Then, we show that the introduction of artificial noise in ANA-MACs can be well employed to resist the key recovery attack even if the opponent has unlimited computing power. Finally, a pragmatic approach for the analysis of ANA-MACs is provided, and we show how to balance the three performance metrics, including the completeness error, the false acceptance probability, and the conditional equivocation about the key. The analysis can be well applied to a class of ANA-MACs, where MACs with the Rijndael cipher are employed.

ITMar 13, 2013
Multi-Stage Robust Chinese Remainder Theorem

Li Xiao, Xiang-Gen Xia, Wenjie Wang

It is well known that the traditional Chinese remainder theorem (CRT) is not robust in the sense that a small error in a remainder may cause a large error in the reconstruction solution. A robust CRT was recently proposed for a special case when the greatest common divisor (gcd) of all the moduli is more than 1 and the remaining integers factorized by the gcd of all the moduli are co-prime. In this special case, a closed-form reconstruction from erroneous remainders was proposed and a necessary and sufficient condition on the remainder errors was obtained. It basically says that the reconstruction error is upper bounded by the remainder error level $τ$ if $τ$ is smaller than a quarter of the gcd of all the moduli. In this paper, we consider the robust reconstruction problem for a general set of moduli. We first present a necessary and sufficient condition on the remainder errors for a robust reconstruction from erroneous remainders with a general set of moduli and also a corresponding robust reconstruction method. This can be thought of as a single-stage robust CRT. We then propose a two-stage robust CRT by grouping the moduli into several groups as follows. First, the single-stage robust CRT is applied to each group. Then, with these robust reconstructions from all the groups, the single-stage robust CRT is applied again across the groups. This is then easily generalized to a multi-stage robust CRT. Interestingly, with this two-stage robust CRT, the robust reconstruction holds even when the remainder error level $τ$ is above a quarter of the gcd of all the moduli. In this paper, we also propose an algorithm on how to group a set of moduli for a better reconstruction robustness of the two-stage robust CRT in some special cases.
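
The special case described above (moduli of the form M·Γ_i with pairwise co-prime Γ_i, and remainder errors below M/4) can be sketched as follows. This is a minimal illustration of the idea and not the authors' code: the function names, the choice of the first remainder as the reference, and the example numbers are assumptions. It requires Python 3.8+ for `pow(a, -1, m)`.

```python
def crt_pair(a1, m1, a2, m2):
    """Combine x = a1 (mod m1) and x = a2 (mod m2) for co-prime m1, m2."""
    t = ((a2 - a1) * pow(m1, -1, m2)) % m2
    return a1 + m1 * t, m1 * m2

def robust_crt(remainders, gcd_m, gammas):
    """Estimate N from erroneous remainders modulo gcd_m * gammas[i].

    Assumes the gammas are pairwise co-prime and every remainder error is
    smaller in magnitude than gcd_m / 4.
    """
    moduli = [gcd_m * g for g in gammas]
    r1, g1 = remainders[0], gammas[0]
    n1, n1_mod = 0, 1   # folding number of the reference modulus, built up by CRT
    qs = [0]            # q_i = n1 * g1 - n_i * g_i, zero for the reference itself
    for r, g in zip(remainders[1:], gammas[1:]):
        # Nearest integer to (r - r1) / gcd_m; the error bound makes this exact.
        q = (2 * (r - r1) + gcd_m) // (2 * gcd_m)
        qs.append(q)
        residue = (q * pow(g1, -1, g)) % g   # n1 modulo g
        n1, n1_mod = crt_pair(n1, n1_mod, residue, g)
    # Recover every folding number n_i and average the per-modulus estimates.
    estimates = [((n1 * g1 - q) // g) * m + r
                 for q, g, m, r in zip(qs, gammas, moduli, remainders)]
    count = len(estimates)
    return (2 * sum(estimates) + count) // (2 * count)   # rounded mean

# Illustrative numbers: M = 20, moduli 60, 100, 140, true N = 1234.
# Exact remainders are 34, 34, 114; errors +3, -4, +2 are all below M / 4 = 5.
estimate = robust_crt([37, 30, 116], 20, [3, 5, 7])
```

In this example the estimate stays within the remainder error level of the true value, whereas feeding the same erroneous remainders to an ordinary CRT solver would generally not be meaningful, since they no longer correspond to a consistent system of congruences.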