Haochen Jiang

h-index5

3papers

9citations

Novelty50%

AI Score41

Ranked #66,879 of 194,257 authors (top 34%)#22,836 in CV (top 39%)

3 Papers

1.5CVJan 7

CroBIM-U: Uncertainty-Driven Referring Remote Sensing Image Segmentation

Yuzhe Sun, Zhe Dong, Haochen Jiang et al.

Referring remote sensing image segmentation aims to localize specific targets described by natural language within complex overhead imagery. However, due to extreme scale variations, dense similar distractors, and intricate boundary structures, the reliability of cross-modal alignment exhibits significant \textbf{spatial non-uniformity}. Existing methods typically employ uniform fusion and refinement strategies across the entire image, which often introduces unnecessary linguistic perturbations in visually clear regions while failing to provide sufficient disambiguation in confused areas. To address this, we propose an \textbf{uncertainty-guided framework} that explicitly leverages a pixel-wise \textbf{referring uncertainty map} as a spatial prior to orchestrate adaptive inference. Specifically, we introduce a plug-and-play \textbf{Referring Uncertainty Scorer (RUS)}, which is trained via an online error-consistency supervision strategy to interpretably predict the spatial distribution of referential ambiguity. Building on this prior, we design two plug-and-play modules: 1) \textbf{Uncertainty-Gated Fusion (UGF)}, which dynamically modulates language injection strength to enhance constraints in high-uncertainty regions while suppressing noise in low-uncertainty ones; and 2) \textbf{Uncertainty-Driven Local Refinement (UDLR)}, which utilizes uncertainty-derived soft masks to focus refinement on error-prone boundaries and fine details. Extensive experiments demonstrate that our method functions as a unified, plug-and-play solution that significantly improves robustness and geometric fidelity in complex remote sensing scenes without altering the backbone architecture.

10.2CVOct 9, 2025Code

PhyDAE: Physics-Guided Degradation-Adaptive Experts for All-in-One Remote Sensing Image Restoration

Zhe Dong, Yuzhe Sun, Haochen Jiang et al.

Remote sensing images inevitably suffer from various degradation factors during acquisition, including atmospheric interference, sensor limitations, and imaging conditions. These complex and heterogeneous degradations pose severe challenges to image quality and downstream interpretation tasks. Addressing limitations of existing all-in-one restoration methods that overly rely on implicit feature representations and lack explicit modeling of degradation physics, this paper proposes Physics-Guided Degradation-Adaptive Experts (PhyDAE). The method employs a two-stage cascaded architecture transforming degradation information from implicit features into explicit decision signals, enabling precise identification and differentiated processing of multiple heterogeneous degradations including haze, noise, blur, and low-light conditions. The model incorporates progressive degradation mining and exploitation mechanisms, where the Residual Manifold Projector (RMP) and Frequency-Aware Degradation Decomposer (FADD) comprehensively analyze degradation characteristics from manifold geometry and frequency perspectives. Physics-aware expert modules and temperature-controlled sparse activation strategies are introduced to enhance computational efficiency while ensuring imaging physics consistency. Extensive experiments on three benchmark datasets (MD-RSID, MD-RRSHID, and MDRS-Landsat) demonstrate that PhyDAE achieves superior performance across all four restoration tasks, comprehensively outperforming state-of-the-art methods. Notably, PhyDAE substantially improves restoration quality while achieving significant reductions in parameter count and computational complexity, resulting in remarkable efficiency gains compared to mainstream approaches and achieving optimal balance between performance and efficiency. Code is available at https://github.com/HIT-SIRS/PhyDAE.

7.6CVMar 18, 2024Code

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation

Haochen Jiang, Yueming Xu, Yihan Zeng et al.

3D reconstruction has been widely used in autonomous navigation fields of mobile robotics. However, the former research can only provide the basic geometry structure without the capability of open-world scene understanding, limiting advanced tasks like human interaction and visual navigation. Moreover, traditional 3D scene understanding approaches rely on expensive labeled 3D datasets to train a model for a single task with supervision. Thus, geometric reconstruction with zero-shot scene understanding i.e. Open vocabulary 3D Understanding and Reconstruction, is crucial for the future development of mobile robots. In this paper, we propose OpenOcc, a novel framework unifying the 3D scene reconstruction and open vocabulary understanding with neural radiance fields. We model the geometric structure of the scene with occupancy representation and distill the pre-trained open vocabulary model into a 3D language field via volume rendering for zero-shot inference. Furthermore, a novel semantic-aware confidence propagation (SCP) method has been proposed to relieve the issue of language field representation degeneracy caused by inconsistent measurements in distilled features. Experimental results show that our approach achieves competitive performance in 3D scene understanding tasks, especially for small and long-tail objects.