Fangzhou Han

CV
h-index18
5papers
50citations
Novelty52%
AI Score36

5 Papers

CVAug 30, 2023
Background Debiased SAR Target Recognition via Causal Interventional Regularizer

Hongwei Dong, Fangzhou Han, Lingyu Si et al.

Recent studies have utilized deep learning (DL) techniques to automatically extract features from synthetic aperture radar (SAR) images, which shows great promise for enhancing the performance of SAR automatic target recognition (ATR). However, our research reveals a previously overlooked issue: SAR images to be recognized include not only the foreground (i.e., the target), but also a certain size of the background area. When a DL-model is trained exclusively on foreground data, its recognition performance is significantly superior to a model trained on original data that includes both foreground and background. This suggests that the presence of background impedes the ability of the DL-model to learn additional semantic information about the target. To address this issue, we construct a structural causal model (SCM) that incorporates the background as a confounder. Based on the constructed SCM, we propose a causal intervention based regularization method to eliminate the negative impact of background on feature semantic learning and achieve background debiased SAR-ATR. The proposed causal interventional regularizer can be integrated into any existing DL-based SAR-ATR models to mitigate the impact of background interference on the feature extraction and recognition accuracy. Experimental results on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset indicate that the proposed method can enhance the efficiency of existing DL-based methods in a plug-and-play manner.

CVMay 9, 2024Code
Exploring Text-Guided Single Image Editing for Remote Sensing Images

Fangzhou Han, Lingyu Si, Zhizhuo Jiang et al.

Artificial intelligence generative content (AIGC) has significantly impacted image generation in the field of remote sensing. However, the equally important area of remote sensing image (RSI) editing has not received sufficient attention. Deep learning based editing methods generally involve two sequential stages: generation and editing. For natural images, these stages primarily rely on generative backbones pre-trained on large-scale benchmark datasets and text guidance facilitated by vision-language models (VLMs). However, it become less viable for RSIs: First, existing generative RSI benchmark datasets do not fully capture the diversity of RSIs, and is often inadequate for universal editing tasks. Second, the single text semantic corresponds to multiple image semantics, leading to the introduction of incorrect semantics. To solve above problems, this paper proposes a text-guided RSI editing method and can be trained using only a single image. A multi-scale training approach is adopted to preserve consistency without the need for training on extensive benchmarks, while leveraging RSI pre-trained VLMs and prompt ensembling (PE) to ensure accuracy and controllability. Experimental results on multiple RSI editing tasks show that the proposed method offers significant advantages in both CLIP scores and subjective evaluations compared to existing methods. Additionally, we explore the ability of the edited RSIs to support disaster assessment tasks in order to validate their practicality. Codes will be released at https://github.com/HIT-PhilipHan/remote_sensing_image_editing.

IVJul 4, 2025
Towards Interpretable PolSAR Image Classification: Polarimetric Scattering Mechanism Informed Concept Bottleneck and Kolmogorov-Arnold Network

Jinqi Zhang, Fangzhou Han, Di Zhuang et al.

In recent years, Deep Learning (DL) based methods have received extensive and sufficient attention in the field of PolSAR image classification, which show excellent performance. However, due to the ``black-box" nature of DL methods, the interpretation of the high-dimensional features extracted and the backtracking of the decision-making process based on the features are still unresolved problems. In this study, we first highlight this issue and attempt to achieve the interpretability analysis of DL-based PolSAR image classification technology with the help of Polarimetric Target Decomposition (PTD), a feature extraction method related to the scattering mechanism unique to the PolSAR image processing field. In our work, by constructing the polarimetric conceptual labels and a novel structure named Parallel Concept Bottleneck Networks (PaCBM), the uninterpretable high-dimensional features are transformed into human-comprehensible concepts based on physically verifiable polarimetric scattering mechanisms. Then, the Kolmogorov-Arnold Network (KAN) is used to replace Multi-Layer Perceptron (MLP) for achieving a more concise and understandable mapping process between layers and further enhanced non-linear modeling ability. The experimental results on several PolSAR datasets show that the features could be conceptualization under the premise of achieving satisfactory accuracy through the proposed pipeline, and the analytical function for predicting category labels from conceptual labels can be obtained by combining spline functions, thus promoting the research on the interpretability of the DL-based PolSAR image classification model.

CVAug 4, 2021
Deep Portrait Lighting Enhancement with 3D Guidance

Fangzhou Han, Can Wang, Hao Du et al.

Despite recent breakthroughs in deep learning methods for image lighting enhancement, they are inferior when applied to portraits because 3D facial information is ignored in their models. To address this, we present a novel deep learning framework for portrait lighting enhancement based on 3D facial guidance. Our framework consists of two stages. In the first stage, corrected lighting parameters are predicted by a network from the input bad lighting image, with the assistance of a 3D morphable model and a differentiable renderer. Given the predicted lighting parameter, the differentiable renderer renders a face image with corrected shading and texture, which serves as the 3D guidance for learning image lighting enhancement in the second stage. To better exploit the long-range correlations between the input and the guidance, in the second stage, we design an image-to-image translation network with a novel transformer architecture, which automatically produces a lighting-enhanced result. Experimental results on the FFHQ dataset and in-the-wild images show that the proposed method outperforms state-of-the-art methods in terms of both quantitative metrics and visual quality. We will publish our dataset along with more results on https://cassiepython.github.io/egsr/index.html.

CVApr 29, 2021
Exemplar-Based 3D Portrait Stylization

Fangzhou Han, Shuquan Ye, Mingming He et al.

Exemplar-based portrait stylization is widely attractive and highly desired. Despite recent successes, it remains challenging, especially when considering both texture and geometric styles. In this paper, we present the first framework for one-shot 3D portrait style transfer, which can generate 3D face models with both the geometry exaggerated and the texture stylized while preserving the identity from the original content. It requires only one arbitrary style image instead of a large set of training examples for a particular style, provides geometry and texture outputs that are fully parameterized and disentangled, and enables further graphics applications with the 3D representations. The framework consists of two stages. In the first geometric style transfer stage, we use facial landmark translation to capture the coarse geometry style and guide the deformation of the dense 3D face geometry. In the second texture style transfer stage, we focus on performing style transfer on the canonical texture by adopting a differentiable renderer to optimize the texture in a multi-view framework. Experiments show that our method achieves robustly good results on different artistic styles and outperforms existing methods. We also demonstrate the advantages of our method via various 2D and 3D graphics applications. Project page is https://halfjoe.github.io/projs/3DPS/index.html.