Jing Zhang

h-index29

4papers

34citations

Novelty45%

AI Score36

Ranked #97,673 of 194,257 authors (top 50%)#32,808 in CV (top 55%)

4 Papers

6.8CVNov 22, 2023Code

Unified Domain Adaptive Semantic Segmentation

Zhe Zhang, Gaochang Wu, Jing Zhang et al.

Unsupervised Domain Adaptive Semantic Segmentation (UDA-SS) aims to transfer the supervision from a labeled source domain to an unlabeled target domain. The majority of existing UDA-SS works typically consider images whilst recent attempts have extended further to tackle videos by modeling the temporal dimension. Although the two lines of research share the major challenges -- overcoming the underlying domain distribution shift, their studies are largely independent, resulting in fragmented insights, a lack of holistic understanding, and missed opportunities for cross-pollination of ideas. This fragmentation prevents the unification of methods, leading to redundant efforts and suboptimal knowledge transfer across image and video domains. Under this observation, we advocate unifying the study of UDA-SS across video and image scenarios, enabling a more comprehensive understanding, synergistic advancements, and efficient knowledge sharing. To that end, we explore the unified UDA-SS from a general data augmentation perspective, serving as a unifying conceptual framework, enabling improved generalization, and potential for cross-pollination of ideas, ultimately contributing to the overall progress and practical impact of this field of research. Specifically, we propose a Quad-directional Mixup (QuadMix) method, characterized by tackling distinct point attributes and feature inconsistencies through four-directional paths for intra- and inter-domain mixing in a feature space. To deal with temporal shifts with videos, we incorporate optical flow-guided feature aggregation across spatial and temporal dimensions for fine-grained domain alignment. Extensive experiments show that our method outperforms the state-of-the-art works by large margins on four challenging UDA-SS benchmarks. Our source code and models will be released at https://github.com/ZHE-SAPI/UDASS.

4.6LGMar 18, 2024Code

Open-World Semi-Supervised Learning for Node Classification

Yanling Wang, Jing Zhang, Lingxi Zhang et al.

Open-world semi-supervised learning (Open-world SSL) for node classification, that classifies unlabeled nodes into seen classes or multiple novel classes, is a practical but under-explored problem in the graph community. As only seen classes have human labels, they are usually better learned than novel classes, and thus exhibit smaller intra-class variances within the embedding space (named as imbalance of intra-class variances between seen and novel classes). Based on empirical and theoretical analysis, we find the variance imbalance can negatively impact the model performance. Pre-trained feature encoders can alleviate this issue via producing compact representations for novel classes. However, creating general pre-trained encoders for various types of graph data has been proven to be challenging. As such, there is a demand for an effective method that does not rely on pre-trained graph encoders. In this paper, we propose an IMbalance-Aware method named OpenIMA for Open-world semi-supervised node classification, which trains the node classification model from scratch via contrastive learning with bias-reduced pseudo labels. Extensive experiments on seven popular graph benchmarks demonstrate the effectiveness of OpenIMA, and the source code has been available on GitHub.

14.4CVJul 10, 2025

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

Zhihui Zhang, Luanyuan Dai, Qika Lin et al.

Large-scale multi-modal models have demonstrated remarkable performance across various visual recognition tasks by leveraging extensive paired multi-modal training data. However, in real-world applications, the presence of missing or incomplete modality inputs often leads to significant performance degradation. Recent research has focused on prompt-based strategies to tackle this issue; however, existing methods are hindered by two major limitations: (1) static prompts lack the flexibility to adapt to varying missing-data conditions, and (2) basic prompt-tuning methods struggle to ensure reliable performance when critical modalities are missing.To address these challenges, we propose a novel Synergistic Prompting (SyP) framework for robust visual recognition with missing modalities. The proposed SyP introduces two key innovations: (I) a Dynamic Adapter, which computes adaptive scaling factors to dynamically generate prompts, replacing static parameters for flexible multi-modal adaptation, and (II) a Synergistic Prompting Strategy, which combines static and dynamic prompts to balance information across modalities, ensuring robust reasoning even when key modalities are missing. The proposed SyP achieves significant performance improvements over existing approaches across three widely-used visual recognition datasets, demonstrating robustness under diverse missing rates and conditions. Extensive experiments and ablation studies validate its effectiveness in handling missing modalities, highlighting its superior adaptability and reliability.

2.6CVMay 27, 2019

Computer-aided Detection of Squamous Carcinoma of the Cervix in Whole Slide Images

Ye Tian, Li Yang, Wei Wang et al.

Goal: Squamous cell carcinoma of cervix is one of the most prevalent cancer worldwide in females. Traditionally, the most indispensable diagnosis of cervix squamous carcinoma is histopathological assessment which is achieved under microscope by pathologist. However, human evaluation of pathology slide is highly depending on the experience of pathologist, thus big inter- and intra-observer variability exists. Digital pathology, in combination with deep learning provides an opportunity to improve the objectivity and efficiency of histopathologic slide analysis. Methods: In this study, we obtained 800 haematoxylin and eosin stained slides from 300 patients suffered from cervix squamous carcinoma. Based on information from morphological heterogeneity in the tumor and its adjacent area, we established deep learning models using popular convolution neural network architectures (inception-v3, InceptionResnet-v2 and Resnet50). Then random forest was introduced to feature extractions and slide-based classification. Results: The overall performance of our proposed models on slide-based tumor discrimination were outstanding with an AUC scores > 0.94. While, location identifications of lesions in whole slide images were mediocre (FROC scores > 0.52) duo to the extreme complexity of tumor tissues. Conclusion: For the first time, our analysis workflow highlighted a quantitative visual-based slide analysis of cervix squamous carcinoma. Significance: This study demonstrates a pathway to assist pathologist and accelerate the diagnosis of patients by utilizing new computational approaches.