IV Jun 16, 2022
Multi-View Imputation and Cross-Attention Network Based on Incomplete Longitudinal and Multimodal Data for Conversion Prediction of Mild Cognitive Impairment
Tao Wang, Xiumei Chen, Xiaoling Zhang et al.
Predicting whether subjects with mild cognitive impairment (MCI) will convert to Alzheimer's disease is a significant clinical challenge. Longitudinal variations and complementary information inherent in longitudinal and multimodal data are crucial for MCI conversion prediction, but the persistent problem of missing data may hinder their effective use. Additionally, in clinical practice conversion prediction should be achieved in the early stages of disease progression, specifically at the baseline visit (BL). Therefore, longitudinal data should only be incorporated during training to capture disease progression information. To address these challenges, a multi-view imputation and cross-attention network (MCNet) was proposed to integrate data imputation and MCI conversion prediction in a unified framework. First, a multi-view imputation method combined with adversarial learning was presented to handle various missing data scenarios and reduce imputation errors. Second, two cross-attention blocks were introduced to exploit the potential associations in longitudinal and multimodal data. Finally, a multi-task learning model was established for data imputation, longitudinal classification, and conversion prediction tasks. Once the model is appropriately trained, the disease progression information learned from longitudinal data can be leveraged by BL data to improve MCI conversion prediction at BL. MCNet was tested on two independent testing sets and on single-modal BL data to verify its effectiveness and flexibility in MCI conversion prediction. Results showed that MCNet outperformed several competitive methods. Moreover, the interpretability of MCNet was demonstrated. Thus, MCNet may be a valuable tool in longitudinal and multimodal data analysis for MCI conversion prediction. Code is available at https://github.com/Meiyan88/MCNET.
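As a rough illustration of the cross-attention idea mentioned in the abstract above, the following minimal sketch computes scaled dot-product attention in which tokens from one view query tokens from another. It is not MCNet's actual block: the learned query/key/value projections are omitted, and the names `cross_attention`, `view_a`, and `view_b` as well as the toy feature values are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries come from one view,
    keys and values from another. Learned projections are omitted
    (an assumption made to keep the sketch short)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key of the other view.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the other view's value vectors.
        outputs.append([
            sum(wj * v[c] for wj, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights


# Hypothetical toy features: one token from view A attends over
# three tokens from view B (e.g. another modality or visit).
view_a = [[1.0, 0.0]]
view_b = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused, attn = cross_attention(view_a, view_b, view_b)
```

Each row of `attn` sums to one, so every fused token is a convex combination of the other view's tokens; a trained block would additionally learn which associations to emphasize.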
CV Jul 8, 2022
A Mask Attention Interaction and Scale Enhancement Network for SAR Ship Instance Segmentation
Tianwen Zhang, Xiaoling Zhang
Most existing synthetic aperture radar (SAR) ship instance segmentation models either do not achieve mask interaction or offer limited interaction performance. Moreover, their multi-scale ship instance segmentation performance is moderate, especially for small ships. To solve these problems, we propose a mask attention interaction and scale enhancement network (MAI-SE-Net) for SAR ship instance segmentation. MAI uses atrous spatial pyramid pooling (ASPP) to obtain multi-resolution feature responses, a non-local block (NLB) to model long-range spatial dependencies, and a concatenation shuffle attention block (CSAB) to improve interaction benefits. SE uses a content-aware reassembly of features block (CARAFEB) to generate an extra pyramid bottom level that boosts small-ship performance, a feature balance operation (FBO) to improve scale feature description, and a global context block (GCB) to refine features. Experimental results on the two public datasets SSDD and HRSID show that MAI-SE-Net outperforms nine other competitive models, exceeding the second-best model by 4.7% detection AP and 3.4% segmentation AP on SSDD and by 3.0% detection AP and 2.4% segmentation AP on HRSID.
CV Jul 11, 2022
A Dual-Polarization Information Guided Network for SAR Ship Classification
Tianwen Zhang, Xiaoling Zhang
How to fully utilize polarization to enhance synthetic aperture radar (SAR) ship classification remains an unresolved issue. Thus, we propose a dual-polarization information guided network (DPIG-Net) to solve it.
CV Jul 7, 2022
Shadow-Background-Noise 3D Spatial Decomposition Using Sparse Low-Rank Gaussian Properties for Video-SAR Moving Target Shadow Enhancement
Xiaowo Xu, Xiaoling Zhang, Tianwen Zhang et al.
Moving target shadows in video synthetic aperture radar (Video-SAR) images are always interfered with by low-scattering backgrounds and cluttered noise, causing poor detection-tracking accuracy. Thus, a shadow-background-noise 3D spatial decomposition (SBN-3D-SD) model is proposed to enhance shadows for higher detection-tracking accuracy. It leverages the sparse property of shadows, the low-rank property of backgrounds, and the Gaussian property of noise to perform a 3D spatial three-way decomposition. It separates shadows from backgrounds and noise using the alternating direction method of multipliers (ADMM). Results on the Sandia National Laboratories (SNL) data verify its effectiveness. It improves shadow saliency in both qualitative and quantitative evaluations. It boosts the shadow detection accuracy of Faster R-CNN, RetinaNet and YOLOv3. It also boosts the shadow tracking accuracy of TransTrack, FairMOT and ByteTrack.
CV Sep 21, 2022
Sar Ship Detection based on Swin Transformer and Feature Enhancement Feature Pyramid Network
Xiao Ke, Xiaoling Zhang, Tianwen Zhang et al.
With the boom in Convolutional Neural Networks (CNNs), networks such as VGG-16 and ResNet-50 widely serve as backbones in SAR ship detection. However, CNN-based backbones struggle to model long-range dependencies and lack sufficient high-quality semantic information in the feature maps of shallow layers, which leads to poor detection performance for complicated backgrounds and small ships. To address these problems, we propose a SAR ship detection method based on Swin Transformer and a Feature Enhancement Feature Pyramid Network (FEFPN). Swin Transformer serves as the backbone to model long-range dependencies and generate hierarchical feature maps. FEFPN is proposed to further improve the quality of feature maps by gradually enhancing the semantic information of feature maps at all levels, especially those in shallow layers. Experiments conducted on the SAR ship detection dataset (SSDD) reveal the advantage of our proposed method.
SP Nov 28, 2022
A Model-data-driven Network Embedding Multidimensional Features for Tomographic SAR Imaging
Yu Ren, Xiaoling Zhang, Xu Zhan et al.
Deep learning (DL)-based tomographic SAR (tomoSAR) imaging algorithms are gradually being studied. Typically, they use an unfolding network to mimic the iterative calculation of classical compressive sensing (CS)-based methods and process each range-azimuth unit individually. However, only one-dimensional features are effectively utilized in this way, and the correlation between adjacent resolution units is ignored. To address this, we propose a new model-data-driven network that achieves tomoSAR imaging based on multi-dimensional features. Guided by the deep unfolding methodology, a two-dimensional deep unfolding imaging network is constructed. On this basis, we add two 2D processing modules, both convolutional encoder-decoder structures, to effectively enhance multi-dimensional features of the imaging scene. Meanwhile, to train the proposed multifeature-based imaging network, we construct a tomoSAR simulation dataset consisting entirely of simulated building data. Experiments verify the effectiveness of the model. Compared with the conventional CS-based FISTA method and the DL-based gamma-Net method, our proposed method achieves better completeness while retaining decent imaging accuracy.
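To make the "iterative calculation" that unfolding networks mimic more concrete, here is a minimal sketch of plain ISTA (iterative shrinkage-thresholding) for a sparse least-squares problem; a deep unfolding network would typically turn each loop iteration into a layer with learned step sizes and thresholds. This is an illustration under simplifying assumptions, not the paper's network: the data are real-valued (tomoSAR measurements are complex), and the function names, the toy matrix `A`, and the regularization weight `lam` are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v| (shrinks v toward zero by t)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, n_iter=100):
    """Minimize 0.5 * ||y - A x||^2 + lam * ||x||_1 with plain ISTA.

    A is a list of rows, y a list of measurements (real-valued toy case).
    """
    m, n = len(A), len(A[0])
    # Step size 1/L, where L = ||A||_F^2 is a safe upper bound on the
    # largest eigenvalue of A^T A (assumes A is not all zeros).
    lipschitz = sum(a * a for row in A for a in row)
    step = 1.0 / lipschitz
    x = [0.0] * n
    for _ in range(n_iter):
        # Gradient step on the data-fidelity term.
        residual = [y[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        z = [x[j] + step * sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Shrinkage step that enforces sparsity.
        x = [soft_threshold(zj, step * lam) for zj in z]
    return x


# Toy problem: with an identity sensing matrix the minimizer is simply
# the soft-thresholded measurement vector.
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
x_hat = ista(A, y, lam=0.5)
```

In this toy case the large entry survives with a bias of `lam` while the small entry is set to zero, which is the sparsity-promoting behavior that CS-based solvers rely on.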
IV Nov 28, 2022
Near-filed SAR Image Restoration with Deep Learning Inverse Technique: A Preliminary Study
Xu Zhan, Xiaoling Zhang, Wensi Zhang et al.
Benefiting from a relatively large aperture angle combined with a wide transmitting bandwidth, near-field synthetic aperture radar (SAR) provides a high-resolution image of a target's scattering distribution (hot spots). However, the imaging result inevitably suffers degradation from sidelobes, clutter, and noise, hindering information retrieval about the target. To restore the image, current methods make simplifying assumptions; for example, that the point spread function (PSF) is spatially invariant or that the target consists of sparse point scatterers. Thus, they achieve limited restoration performance in terms of the target's shape, especially for complex targets. To address these issues, this work presents a preliminary study on restoration with the recent and promising deep learning inverse technique. We reformulate the degradation model as a spatially variable complex-convolution model that accounts for the near-field SAR system response. Based on this model, a model-based deep learning network is designed to restore the image. A simulated degraded-image dataset built from multiple complex target models is constructed to validate the network. All the images are generated using an electromagnetic simulation tool. Experiments on the dataset demonstrate the network's effectiveness. Compared with current methods, superior performance is achieved in estimating the target's shape and energy.
SP Nov 28, 2022
Solving 3D Radar Imaging Inverse Problems with a Multi-cognition Task-oriented Framework
Xu Zhan, Xiaoling Zhang, Mou Wang et al.
This work focuses on 3D radar imaging inverse problems. Current methods obtain undifferentiated results that suffer task-dependent information retrieval loss and thus do not meet a task's specific demands well. For example, biased scattering energy may be acceptable for screening imaging but not for scattering diagnosis. To address this issue, we propose a new task-oriented imaging framework. The imaging principle is task-oriented, using an analysis phase to obtain the task's demands. The imaging model is multi-cognition regularized to embed and fulfill these demands. The imaging method is designed to be generalized: couplings between cognitions are decoupled and solved individually with approximation and variable-splitting techniques. Tasks including scattering diagnosis, person screening imaging, and parcel screening imaging are given as examples. Experiments on data from two systems indicate that the proposed framework outperforms current ones in task-dependent information retrieval.
16.2 CE Apr 25
UAV Trajectory and Bandwidth Allocation for Efficient Data Collection in Low-Altitude Intelligent IoT: A Hierarchical DRL Approach
Zhenjia Xu, Xiaoling Zhang, Nan Qi et al.
As wireless networks evolve toward 6G, the low-altitude Internet of Things (IoT), supported by unmanned aerial vehicles (UAVs) with Integrated Sensing and Communication (ISAC) capabilities, provides ground sensing networks with advanced real-time monitoring and data collection. To maximize the data collection volume from distributed IoT nodes, AI-powered data collection technology plays a critical role in enabling intelligent decision-making. Among such technologies, deep reinforcement learning (DRL) has gained particular attention. However, existing DRL-based work on UAV-assisted data collection from IoT nodes rarely addresses problems such as unknown interference and dynamic data volume. Moreover, these DRL models have high computational requirements and converge slowly, making them difficult to deploy on UAVs with limited payload and computing power. To address these challenges, a hierarchical deep reinforcement learning (HDRL) approach, which converges quickly with smaller models, is designed to optimize UAV trajectories and bandwidth allocation so as to maximize data collection volume. First, the proposed scenario incorporates interference from jammers, dynamic data volume at IoT nodes, and multiple types of obstacles. The entire task is hierarchically structured: the upper level makes flight trajectory decisions at a coarse temporal granularity, while the lower level makes bandwidth allocation decisions at a finer temporal granularity. Second, a trajectory and bandwidth allocation optimization algorithm based on hierarchical deep deterministic policy gradients (TBH-DDPG) is proposed to solve the problem. Finally, simulation results demonstrate that the proposed algorithm improves convergence speed by 44.44% and reduces computational cost by 58.05% compared with a non-hierarchical algorithm.
IV Dec 22, 2024
Technical Report: Towards Spatial Feature Regularization in Deep-Learning-Based Array-SAR Reconstruction
Yu Ren, Xu Zhan, Yunqiao Hu et al.
Array synthetic aperture radar (Array-SAR), also known as tomographic SAR (TomoSAR), has demonstrated significant potential for high-quality 3D mapping, particularly in urban areas. While deep learning (DL) methods have recently shown strengths in reconstruction, most studies rely on pixel-by-pixel reconstruction, neglecting spatial features like building structures, leading to artifacts such as holes and fragmented edges. Spatial feature regularization, effective in traditional methods, remains underexplored in DL-based approaches. Our study integrates spatial feature regularization into DL-based Array-SAR reconstruction, addressing key questions: What spatial features are relevant in urban-area mapping? How can these features be effectively described, modeled, regularized, and incorporated into DL networks? The study comprises five phases: spatial feature description and modeling, regularization, feature-enhanced network design, evaluation, and discussions. Sharp edges and geometric shapes in urban scenes are analyzed as key features. An intra-slice and inter-slice strategy is proposed, using 2D slices as reconstruction units and fusing them into 3D scenes through parallel and serial fusion. Two computational frameworks, iterative reconstruction with enhancement and light reconstruction with enhancement, are designed, incorporating spatial feature modules into DL networks, leading to four specialized reconstruction networks. Using our urban building simulation dataset and two public datasets, six tests evaluate close-point resolution, structural integrity, and robustness in urban scenarios. Results show that spatial feature regularization significantly improves reconstruction accuracy, retrieves more complete building structures, and enhances robustness by reducing noise and outliers.
ML Oct 4, 2021
Stochastic tensor space feature theory with applications to robust machine learning
Julio Enrique Castrillon-Candas, Dingning Liu, Sicheng Yang et al.
In this paper we develop a Multilevel Orthogonal Subspace (MOS) Karhunen-Loeve feature theory based on stochastic tensor spaces, for the construction of robust machine learning features. Training data is treated as instances of a random field within a relevant Bochner space. Our key observation is that separate machine learning classes can reside predominantly in mostly distinct subspaces. Using the Karhunen-Loeve expansion and a hierarchical expansion of the first (nominal) class, a MOS is constructed to detect anomalous signal components, treating the second class as an outlier of the first. The projection coefficients of the input data into these subspaces are then used to train a Machine Learning (ML) classifier. These coefficients become new features from which much clearer separation surfaces can arise for the underlying classes. Tests in the blood plasma dataset (Alzheimer's Disease Neuroimaging Initiative) show dramatic increases in accuracy. This is in contrast to popular ML methods such as Gradient Boosting, RUS Boost, Random Forest and (Convolutional) Neural Networks.
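The idea of using projection coefficients onto a nominal-class subspace as features can be sketched in a heavily simplified form. The example below keeps only the leading Karhunen-Loeve (principal) direction of the nominal class, found by power iteration, and returns the projection coefficient plus the residual norm as two features. It omits the multilevel hierarchy and the stochastic tensor space setting entirely; the function names and toy samples are hypothetical.

```python
import math


def leading_direction(samples, n_iter=50):
    """Mean and leading eigenvector of the sample covariance,
    estimated by power iteration (single-component toy version)."""
    n, dim = len(samples), len(samples[0])
    mu = [sum(s[j] for s in samples) / n for j in range(dim)]
    centered = [[s[j] - mu[j] for j in range(dim)] for s in samples]
    # Uniform start vector; a random start would be safer in general,
    # since this one may be orthogonal to the leading direction.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_iter):
        # w is proportional to C v, with C the sample covariance.
        w = [0.0] * dim
        for x in centered:
            dot = sum(xi * vi for xi, vi in zip(x, v))
            for j in range(dim):
                w[j] += dot * x[j]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm == 0.0:
            break
        v = [wj / norm for wj in w]
    return mu, v


def subspace_features(x, mu, v):
    """Projection coefficient onto v and norm of the orthogonal residual."""
    c = [xi - mi for xi, mi in zip(x, mu)]
    coeff = sum(ci * vi for ci, vi in zip(c, v))
    residual = math.sqrt(sum((ci - coeff * vi) ** 2 for ci, vi in zip(c, v)))
    return coeff, residual


# Hypothetical nominal class that varies only along the first axis.
nominal = [[-2.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
mu, v = leading_direction(nominal)
in_class = subspace_features([3.0, 0.0], mu, v)
outlier = subspace_features([0.0, 1.0], mu, v)
```

A sample consistent with the nominal class has a near-zero residual, while a sample with energy outside the nominal subspace has a large one; such residual and coefficient values are the kind of features a downstream classifier could be trained on.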
CV Jul 21, 2020
Balance Scene Learning Mechanism for Offshore and Inshore Ship Detection in SAR Images
Tianwen Zhang, Xiaoling Zhang, Jun Shi et al.
A huge imbalance in the number of samples across different scenes seriously reduces Synthetic Aperture Radar (SAR) ship detection accuracy. To solve this problem, this letter proposes a Balance Scene Learning Mechanism (BSLM) for offshore and inshore ship detection in SAR images.
CV May 8, 2020
NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results
Abdelrahman Abdelhamed, Mahmoud Afifi, Radu Timofte et al.
This paper reviews the NTIRE 2020 challenge on real image denoising, with a focus on the newly introduced dataset, the proposed methods, and their results. The challenge is a new version of the previous NTIRE 2019 challenge on real image denoising, which was based on the SIDD benchmark. This challenge is based on newly collected validation and testing image datasets and is hence named SIDD+. It has two tracks for quantitatively evaluating image denoising performance in (1) the Bayer-pattern rawRGB and (2) the standard RGB (sRGB) color spaces. Each track had ~250 registered participants. A total of 22 teams, proposing 24 methods, competed in the final phase of the challenge. The methods proposed by the participating teams represent the current state-of-the-art performance in image denoising targeting real noisy images. The newly collected SIDD+ datasets are publicly available at: https://bit.ly/siddplus_data.