Xu Zhang

h-index: 11

Papers: 3

Citations: 632

Novelty: 50%

AI Score: 33

Ranked #121,441 of 194,257 authors (top 63%); #40,368 in CV (top 68%)

3 Papers

3.7 | CV | Oct 13, 2024

EMWaveNet: Physically Explainable Neural Network Based on Electromagnetic Propagation for SAR Target Recognition

Zhuoxuan Li, Xu Zhang, Shumeng Yu et al.

Deep learning technologies have significantly improved performance in the field of synthetic aperture radar (SAR) image target recognition compared to traditional methods. However, the inherent "black box" property of deep learning models leads to a lack of transparency in decision-making processes, which makes them difficult to apply widely in practice. To tackle this issue, this study proposes a physically explainable framework for complex-valued SAR image recognition, designed based on the physical process of microwave propagation. This framework utilizes complex-valued SAR data to explore the amplitude and phase information and its intrinsic physical properties. The network architecture is fully parameterized, with all learnable parameters endowed with clear physical meanings. Experiments on both the complex-valued MSTAR dataset and a self-built Qilu-1 complex-valued dataset were conducted to validate the effectiveness of the framework. The de-overlapping capability of EMWaveNet enables accurate recognition of overlapping target categories, whereas other models are nearly incapable of performing such recognition. Against 0 dB forest background noise, it achieves a 20% accuracy improvement over traditional neural networks. When targets are 60% masked by noise, it still outperforms other models by 9%. An end-to-end complex-valued synthetic aperture radar automatic target recognition (SAR-ATR) algorithm is constructed to perform recognition tasks in interference SAR scenarios. The results demonstrate that the proposed method possesses a strong physical decision logic, high physical explainability and robustness, as well as excellent de-aliasing capabilities. Finally, a perspective on future applications is provided.

0.3 | CL | Mar 30, 2020

Learning Contextualized Sentence Representations for Document-Level Neural Machine Translation

Pei Zhang, Xu Zhang, Wei Chen et al.

Document-level machine translation incorporates inter-sentential dependencies into the translation of a source sentence. In this paper, we propose a new framework to model cross-sentence dependencies by training neural machine translation (NMT) to predict both the target translation and surrounding sentences of a source sentence. By forcing the NMT model to predict source context, we want the model to learn "contextualized" source sentence representations that capture document-level dependencies on the source side. We further propose two different methods to learn and integrate such contextualized sentence embeddings into NMT: a joint training method that jointly trains an NMT model with the source context prediction model, and a pre-training & fine-tuning method that pretrains the source context prediction model on a large-scale monolingual document corpus and then fine-tunes it with the NMT model. Experiments on Chinese-English and English-German translation show that both methods can substantially improve the translation quality over a strong document-level Transformer baseline.
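As a rough illustration of the joint training idea described in this abstract, the objective can be thought of as a translation loss plus context-prediction losses. The form below is only a sketch under stated assumptions: the abstract says "surrounding sentences" without specifying which ones, so the previous and next source sentences $x_{\text{prev}}$ and $x_{\text{next}}$ are assumed here, and the weight $\lambda$ is a hypothetical parameter not taken from the paper.

```latex
% Sketch of a joint objective (assumed form, not the paper's exact loss):
% x = source sentence, y = target translation,
% x_prev / x_next = assumed surrounding source sentences,
% lambda = hypothetical weight on the context-prediction terms.
\mathcal{L}_{\text{joint}}(\theta)
  = -\log P(y \mid x;\, \theta)
    \;-\; \lambda \Big[ \log P(x_{\text{prev}} \mid x;\, \theta)
                      + \log P(x_{\text{next}} \mid x;\, \theta) \Big]
```

Under the pre-training & fine-tuning variant, the bracketed context-prediction terms would presumably be optimized first on monolingual documents, with the translation term added afterwards.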

35.2 | CV | Jul 15, 2019 | Code

Detecting and Simulating Artifacts in GAN Fake Images

Xu Zhang, Svebor Karaman, Shih-Fu Chang

To detect GAN-generated images, conventional supervised machine learning algorithms require collection of a number of real and fake images from the targeted GAN model. However, the specific model used by the attacker is often unavailable. To address this, we propose a GAN simulator, AutoGAN, which can simulate the artifacts produced by the common pipeline shared by several popular GAN models. Additionally, we identify a unique artifact caused by the up-sampling component included in the common GAN pipeline. We show theoretically that such artifacts are manifested as replications of spectra in the frequency domain and thus propose a classifier model based on the spectrum input, rather than the pixel input. By using the simulated images to train a spectrum-based classifier, even without seeing the fake images produced by the targeted GAN model during training, our approach achieves state-of-the-art performance on detecting fake images generated by popular GAN models such as CycleGAN.
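The claim that up-sampling replicates spectra in the frequency domain can be checked numerically on a toy example. The sketch below is not the paper's pipeline: it assumes the simplest form of up-sampling (inserting zeros between pixels, with no interpolation filter or learned kernel), and the helper name `zero_insert_upsample` is hypothetical.

```python
import numpy as np


def zero_insert_upsample(img, factor=2):
    """Up-sample a 2-D array by inserting zeros between samples.

    This is an assumed, simplified stand-in for the up-sampling step
    of a generator; real models follow it with a learned filter.
    """
    h, w = img.shape
    up = np.zeros((h * factor, w * factor), dtype=img.dtype)
    up[::factor, ::factor] = img
    return up


# Random low-resolution "image" used purely for illustration.
rng = np.random.default_rng(0)
low_res = rng.standard_normal((8, 8))
up = zero_insert_upsample(low_res, factor=2)

# 2-D discrete Fourier transforms of both images.
spec_low = np.fft.fft2(low_res)
spec_up = np.fft.fft2(up)

# With zero insertion, the up-sampled spectrum is the low-resolution
# spectrum tiled factor x factor times.
replicated = np.allclose(spec_up, np.tile(spec_low, (2, 2)))
print("spectrum replicated:", replicated)
```

In a real generator the filter applied after up-sampling attenuates the replicas without necessarily removing them, which is consistent with the abstract's motivation for classifying on the spectrum rather than on pixels.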