Jingyi Yuan

CV
h-index11
3papers
10citations
Novelty50%
AI Score37

3 Papers

CVJun 25, 2023
A ground-based dataset and a diffusion model for on-orbit low-light image enhancement

Yiman Zhu, Lu Wang, Jingyi Yuan et al.

On-orbit service is important for maintaining the sustainability of space environment. Space-based visible camera is an economical and lightweight sensor for situation awareness during on-orbit service. However, it can be easily affected by the low illumination environment. Recently, deep learning has achieved remarkable success in image enhancement of natural images, but seldom applied in space due to the data bottleneck. In this article, we first propose a dataset of the Beidou Navigation Satellite for on-orbit low-light image enhancement (LLIE). In the automatic data collection scheme, we focus on reducing domain gap and improving the diversity of the dataset. we collect hardware in-the-loop images based on a robotic simulation testbed imitating space lighting conditions. To evenly sample poses of different orientation and distance without collision, a collision-free working space and pose stratified sampling is proposed. Afterwards, a novel diffusion model is proposed. To enhance the image contrast without over-exposure and blurring details, we design a fused attention to highlight the structure and dark region. Finally, we compare our method with previous methods using our dataset, which indicates that our method has a better capacity in on-orbit LLIE.

CVSep 17, 2025Code
AD-DINOv3: Enhancing DINOv3 for Zero-Shot Anomaly Detection with Anomaly-Aware Calibration

Jingyi Yuan, Jianxiong Ye, Wenkang Chen et al.

Zero-Shot Anomaly Detection (ZSAD) seeks to identify anomalies from arbitrary novel categories, offering a scalable and annotation-efficient solution. Traditionally, most ZSAD works have been based on the CLIP model, which performs anomaly detection by calculating the similarity between visual and text embeddings. Recently, vision foundation models such as DINOv3 have demonstrated strong transferable representation capabilities. In this work, we are the first to adapt DINOv3 for ZSAD. However, this adaptation presents two key challenges: (i) the domain bias between large-scale pretraining data and anomaly detection tasks leads to feature misalignment; and (ii) the inherent bias toward global semantics in pretrained representations often leads to subtle anomalies being misinterpreted as part of the normal foreground objects, rather than being distinguished as abnormal regions. To overcome these challenges, we introduce AD-DINOv3, a novel vision-language multimodal framework designed for ZSAD. Specifically, we formulate anomaly detection as a multimodal contrastive learning problem, where DINOv3 is employed as the visual backbone to extract patch tokens and a CLS token, and the CLIP text encoder provides embeddings for both normal and abnormal prompts. To bridge the domain gap, lightweight adapters are introduced in both modalities, enabling their representations to be recalibrated for the anomaly detection task. Beyond this baseline alignment, we further design an Anomaly-Aware Calibration Module (AACM), which explicitly guides the CLS token to attend to anomalous regions rather than generic foreground semantics, thereby enhancing discriminability. Extensive experiments on eight industrial and medical benchmarks demonstrate that AD-DINOv3 consistently matches or surpasses state-of-the-art methods.The code will be available at https://github.com/Kaisor-Yuan/AD-DINOv3.

CVMar 17, 2025
AFR-CLIP: Enhancing Zero-Shot Industrial Anomaly Detection with Stateless-to-Stateful Anomaly Feature Rectification

Jingyi Yuan, Chenqiang Gao, Pengyu Jie et al.

Recently, zero-shot anomaly detection (ZSAD) has emerged as a pivotal paradigm for industrial inspection and medical diagnostics, detecting defects in novel objects without requiring any target-dataset samples during training. Existing CLIP-based ZSAD methods generate anomaly maps by measuring the cosine similarity between visual and textual features. However, CLIP's alignment with object categories instead of their anomalous states limits its effectiveness for anomaly detection. To address this limitation, we propose AFR-CLIP, a CLIP-based anomaly feature rectification framework. AFR-CLIP first performs image-guided textual rectification, embedding the implicit defect information from the image into a stateless prompt that describes the object category without indicating any anomalous state. The enriched textual embeddings are then compared with two pre-defined stateful (normal or abnormal) embeddings, and their text-on-text similarity yields the anomaly map that highlights defective regions. To further enhance perception to multi-scale features and complex anomalies, we introduce self prompting (SP) and multi-patch feature aggregation (MPFA) modules. Extensive experiments are conducted on eleven anomaly detection benchmarks across industrial and medical domains, demonstrating AFR-CLIP's superiority in ZSAD.