Yuhang Gan

CV
h-index: 18
5 papers
95 citations
Novelty: 53%
AI Score: 45

5 Papers

CV | Apr 1, 2022 | Code
An End-to-end Supervised Domain Adaptation Framework for Cross-Domain Change Detection

Jia Liu, Wenjie Xuan, Yuhang Gan et al.

Existing deep learning-based change detection methods try to elaborately design complicated neural networks with powerful feature representations, but ignore the universal domain shift induced by time-varying land cover changes, including luminance fluctuations and season changes between pre-event and post-event images, thereby producing sub-optimal results. In this paper, we propose an end-to-end Supervised Domain Adaptation framework for cross-domain Change Detection, namely SDACD, to effectively alleviate the domain shift between bi-temporal images for better change predictions. Specifically, our SDACD presents collaborative adaptations from both image and feature perspectives with supervised learning. Image adaptation exploits generative adversarial learning with cycle-consistency constraints to perform cross-domain style transformation, effectively narrowing the domain gap in a two-side generation fashion. As to feature adaptation, we extract domain-invariant features to align different feature distributions in the feature space, which could further reduce the domain gap of cross-domain images. To further improve the performance, we combine three types of bi-temporal images for the final change prediction, including the initial input bi-temporal images and two generated bi-temporal images from the pre-event and post-event domains. Extensive experiments and analyses on two benchmarks demonstrate the effectiveness and universality of our proposed framework. Notably, our framework pushes several representative baseline models up to new State-Of-The-Art records, achieving 97.34% and 92.36% on the CDD and WHU building datasets, respectively. The source code and models are publicly available at https://github.com/Perfect-You/SDACD.
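The cycle-consistency constraint mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard L1 cycle loss known from CycleGAN-style training; the exact loss used by SDACD is not stated here. The function names and the toy brightness-shift "generators" are hypothetical stand-ins for learned networks.

```python
def l1(a, b):
    """Mean absolute difference between two flat pixel lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def cycle_consistency_loss(pre, post, g_pre_to_post, g_post_to_pre):
    """Translating an image to the other domain and back should reproduce it.

    g_pre_to_post and g_post_to_pre are hypothetical style generators
    (callables mapping a pixel list to a pixel list).
    """
    forward = l1(g_post_to_pre(g_pre_to_post(pre)), pre)
    backward = l1(g_pre_to_post(g_post_to_pre(post)), post)
    return forward + backward

# Toy generators: the "post" domain is modeled as 0.2 brighter than "pre".
brighten = lambda img: [p + 0.2 for p in img]
darken = lambda img: [p - 0.2 for p in img]

pre_image = [0.1, 0.4, 0.7]
post_image = [0.3, 0.5, 0.9]
print(cycle_consistency_loss(pre_image, post_image, brighten, darken))
```

With generators that invert each other, the loss is close to zero. A mismatched pair is penalized in proportion to how far the round trip drifts from the input.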

64.5 | DC | Apr 20
GPUOS: A GPU Operating System Primitive for Transparent Operation Fusion

Yiwei Yang, Xiangyu Gao, Yuan Zhou et al.

Modern deep learning workloads often consist of many small tensor operations, especially in inference, attention, and micro-batched training. In these settings, kernel launch overhead can become a major bottleneck, sometimes exceeding the actual computation time. We present GPUOS, a GPU runtime JIT system that reduces launch overhead using a persistent kernel architecture with runtime operator injection. GPUOS runs a single long-lived GPU kernel that continuously processes tasks from a host-managed work queue, eliminating repeated kernel launches. To support diverse operations, GPUOS uses NVIDIA NVRTC to just-in-time compile operators at runtime and inject them into the running kernel through device function pointer tables. This design enables operator updates without restarting the kernel or recompiling the system. GPUOS introduces four key ideas: (1) a persistent worker kernel with atomic task queues, (2) a runtime operator injection mechanism based on NVRTC and relocatable device code, (3) a dual-slot aliasing scheme for safe concurrent operator updates, and (4) transparent PyTorch integration through TorchDispatch that batches micro-operations into unified submissions. The system supports arbitrary tensor shapes, strides, data types, and broadcasting through a generic tensor abstraction. Experiments show that GPUOS achieves up to 15.3x speedup over standard PyTorch on workloads dominated by small operations, including micro-batched inference and attention patterns. GPUOS improves utilization while remaining compatible with the PyTorch ecosystem.
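The persistent-worker idea can be sketched on the CPU as an analogy. This is not GPUOS code and uses no CUDA or NVRTC: a Python thread stands in for the long-lived kernel, a `queue.Queue` for the host-managed work queue, and a dictionary for the device function pointer table. The class and method names are hypothetical.

```python
import queue
import threading

class PersistentWorker:
    """One long-lived worker that dispatches tasks through a mutable operator table."""

    def __init__(self):
        self.ops = {}                # operator table: name -> callable
        self.tasks = queue.Queue()   # work queue filled by the "host"
        self.results = {}            # task_id -> result
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()         # started once, never relaunched per task

    def inject(self, name, fn):
        """Register or replace an operator while the worker keeps running."""
        self.ops[name] = fn

    def submit(self, task_id, op_name, *args):
        self.tasks.put((task_id, op_name, args))

    def _run(self):
        while True:
            item = self.tasks.get()
            if item is None:         # sentinel: stop the worker loop
                break
            task_id, op_name, args = item
            self.results[task_id] = self.ops[op_name](*args)

    def shutdown(self):
        self.tasks.put(None)
        self._thread.join()

worker = PersistentWorker()
worker.inject("add", lambda a, b: a + b)
worker.submit(0, "add", 2, 3)
worker.inject("mul", lambda a, b: a * b)   # new operator, no worker restart
worker.submit(1, "mul", 2, 3)
worker.shutdown()
print(worker.results)
```

The point of the design is that the per-task cost is a queue operation and a table lookup rather than a fresh launch. The sketch omits the dual-slot scheme the abstract describes for safe concurrent operator updates.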

CV | Apr 27, 2024 | Code
RFL-CDNet: Towards Accurate Change Detection via Richer Feature Learning

Yuhang Gan, Wenjie Xuan, Hang Chen et al.

Change Detection is a crucial but extremely challenging task of remote sensing image analysis, and much progress has been made with the rapid development of deep learning. However, most existing deep learning-based change detection methods mainly focus on intricate feature extraction and multi-scale feature fusion, while ignoring the insufficient utilization of features in the intermediate stages, thus resulting in sub-optimal results. To this end, we propose a novel framework, named RFL-CDNet, that utilizes richer feature learning to boost change detection performance. Specifically, we first introduce deep multiple supervision to enhance intermediate representations, thus unleashing the potential of backbone feature extractor at each stage. Furthermore, we design the Coarse-To-Fine Guiding (C2FG) module and the Learnable Fusion (LF) module to further improve feature learning and obtain more discriminative feature representations. The C2FG module aims to seamlessly integrate the side prediction from the previous coarse-scale into the current fine-scale prediction in a coarse-to-fine manner, while LF module assumes that the contribution of each stage and each spatial location is independent, thus designing a learnable module to fuse multiple predictions. Experiments on several benchmark datasets show that our proposed RFL-CDNet achieves state-of-the-art performance on WHU cultivated land dataset and CDD dataset, and the second-best performance on WHU building dataset. The source code and models are publicly available at https://github.com/Hhaizee/RFL-CDNet.
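One way to read the Learnable Fusion idea is as a per-stage, per-pixel weighted sum of side predictions. The sketch below assumes the weights are normalized with a softmax across stages at each location; that normalization, the function name, and the nested-list layout are illustrative assumptions, not details taken from RFL-CDNet.

```python
import math

def fuse_predictions(side_preds, logits):
    """Fuse stage-wise change maps with independent weights per stage and pixel.

    side_preds[s][y][x]: change probability predicted by stage s.
    logits[s][y][x]:     learnable weight logit for stage s at pixel (y, x).
    """
    stages = len(side_preds)
    height = len(side_preds[0])
    width = len(side_preds[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            exps = [math.exp(logits[s][y][x]) for s in range(stages)]
            total = sum(exps)
            # Softmax over stages, so the weights at each pixel sum to 1.
            fused[y][x] = sum(
                exps[s] / total * side_preds[s][y][x] for s in range(stages)
            )
    return fused

coarse = [[0.2, 0.8]]
fine = [[0.6, 0.4]]
uniform = [[[0.0, 0.0]], [[0.0, 0.0]]]
print(fuse_predictions([coarse, fine], uniform))
```

With all logits equal, the fusion reduces to a plain average of the stages. During training the logits would be updated by gradient descent so that each location can favor whichever stage is more reliable there.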

CV | Dec 22, 2024 | Code
Detect Changes like Humans: Incorporating Semantic Priors for Improved Change Detection

Yuhang Gan, Wenjie Xuan, Zhiming Luo et al.

When given two similar images, humans identify their differences by comparing the appearance (e.g., color, texture) with the help of semantics (e.g., objects, relations). However, mainstream binary change detection models adopt a supervised training paradigm, where the annotated binary change map is the main constraint. Thus, such methods primarily emphasize difference-aware features between bi-temporal images, and the semantic understanding of changed landscapes is undermined, resulting in limited accuracy in the face of noise and illumination variations. To this end, this paper explores incorporating semantic priors from visual foundation models to improve the ability to detect changes. Firstly, we propose a Semantic-Aware Change Detection network (SA-CDNet), which transfers the knowledge of visual foundation models (i.e., FastSAM) to change detection. Inspired by the human visual paradigm, a novel dual-stream feature decoder is derived to distinguish changes by combining semantic-aware features and difference-aware features. Secondly, we explore a single-temporal pre-training strategy for better adaptation of visual foundation models. With pseudo-change data constructed from single-temporal segmentation datasets, we employ an extra branch of proxy semantic segmentation task for pre-training. We explore various settings like dataset combinations and landscape types, thus providing valuable insights. Experimental results on five challenging benchmarks demonstrate the superiority of our method over the existing state-of-the-art methods. The code is available at https://github.com/DREAMXFAR/SA-CDNet.

QUANT-PH | Nov 22, 2024
Scalable Community Detection Using Quantum Hamiltonian Descent and QUBO Formulation

Jinglei Cheng, Ruilin Zhou, Yuhang Gan et al.

We present a quantum-inspired algorithm that utilizes Quantum Hamiltonian Descent (QHD) for efficient community detection. Our approach reformulates the community detection task as a Quadratic Unconstrained Binary Optimization (QUBO) problem, and QHD is deployed to identify optimal community structures. We implement a multi-level algorithm that iteratively refines community assignments by alternating between QUBO problem setup and QHD-based optimization. Benchmarking shows our method achieves up to 5.49% better modularity scores while requiring less computational time compared to classical optimization approaches. This work demonstrates the potential of hybrid quantum-inspired solutions for advancing community detection in large-scale graph data.
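To make the QUBO reformulation concrete, here is a minimal sketch for the simplest case of splitting a graph into two communities. It assumes the textbook modularity matrix B_ij = A_ij - k_i k_j / (2m) and uses Q = -B, so that minimizing x^T Q x over binary x maximizes modularity. The paper's own multi-community formulation is not given here, and exhaustive search stands in for QHD, which this sketch does not implement. All function names are hypothetical.

```python
from itertools import product

def modularity_qubo(edges, n):
    """Build Q so that minimizing x^T Q x maximizes two-community modularity."""
    adj = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        adj[u][v] += 1.0
        adj[v][u] += 1.0
    deg = [sum(row) for row in adj]
    two_m = sum(deg)  # twice the number of edges
    # Q = -B, where B_ij = A_ij - k_i * k_j / (2m) is the modularity matrix.
    return [[-(adj[i][j] - deg[i] * deg[j] / two_m) for j in range(n)]
            for i in range(n)]

def qubo_energy(q, x):
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def brute_force_minimize(q):
    """Exhaustive solver standing in for QHD; only feasible for tiny graphs."""
    return min(product((0, 1), repeat=len(q)), key=lambda x: qubo_energy(q, x))

# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q = modularity_qubo(edges, 6)
best = brute_force_minimize(q)
modularity = -qubo_energy(q, best) / len(edges)
print(best, modularity)
```

Because the rows of B sum to zero, the two-community modularity reduces to x^T B x / m, which is why the energy is divided by the edge count in the last step. The solver returns the split that separates the two triangles, up to swapping the two labels.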