Yimeng Fan

CV
h-index13
6papers
39citations
Novelty46%
AI Score42

6 Papers

CVSep 6, 2024
Cycle Pixel Difference Network for Crisp Edge Detection

Changsong Liu, Wei Zhang, Yanyan Liu et al.

Edge detection, as a fundamental task in computer vision, has garnered increasing attention. The advent of deep learning has significantly advanced this field. However, recent deep learning-based methods generally face two significant issues: 1) reliance on large-scale pre-trained weights, and 2) generation of thick edges. We construct a U-shape encoder-decoder model named CPD-Net that successfully addresses these two issues simultaneously. In response to issue 1), we propose a novel cycle pixel difference convolution (CPDC), which effectively integrates edge prior knowledge with modern convolution operations, consequently successfully eliminating the dependence on large-scale pre-trained weights. As for issue 2), we construct a multi-scale information enhancement module (MSEM) and a dual residual connection-based (DRC) decoder to enhance the edge location ability of the model, thereby generating crisp and clean contour maps. Comprehensive experiments conducted on four standard benchmarks demonstrate that our method achieves competitive performance on the BSDS500 dataset (ODS=0.813 and AC=0.352), NYUD-V2 (ODS=0.760 and AC=0.223), BIPED dataset (ODS=0.898 and AC=0.426), and CID (ODS=0.59). Our approach provides a novel perspective for addressing these challenges in edge detection.

CVMar 22, 2024Code
SFOD: Spiking Fusion Object Detector

Yimeng Fan, Wei Zhang, Changsong Liu et al.

Event cameras, characterized by high temporal resolution, high dynamic range, low power consumption, and high pixel bandwidth, offer unique capabilities for object detection in specialized contexts. Despite these advantages, the inherent sparsity and asynchrony of event data pose challenges to existing object detection algorithms. Spiking Neural Networks (SNNs), inspired by the way the human brain codes and processes information, offer a potential solution to these difficulties. However, their performance in object detection using event cameras is limited in current implementations. In this paper, we propose the Spiking Fusion Object Detector (SFOD), a simple and efficient approach to SNN-based object detection. Specifically, we design a Spiking Fusion Module, achieving the first-time fusion of feature maps from different scales in SNNs applied to event cameras. Additionally, through integrating our analysis and experiments conducted during the pretraining of the backbone network on the NCAR dataset, we delve deeply into the impact of spiking decoding strategies and loss functions on model performance. Thereby, we establish state-of-the-art classification results based on SNNs, achieving 93.7\% accuracy on the NCAR dataset. Experimental results on the GEN1 detection dataset demonstrate that the SFOD achieves a state-of-the-art mAP of 32.1\%, outperforming existing SNN-based approaches. Our research not only underscores the potential of SNNs in object detection with event cameras but also propels the advancement of SNNs. Code is available at https://github.com/yimeng-fan/SFOD.

LGMar 20, 2024Code
Sequential Modeling of Complex Marine Navigation: Case Study on a Passenger Vessel (Student Abstract)

Yimeng Fan, Pedram Agand, Mo Chen et al.

The maritime industry's continuous commitment to sustainability has led to a dedicated exploration of methods to reduce vessel fuel consumption. This paper undertakes this challenge through a machine learning approach, leveraging a real-world dataset spanning two years of a ferry in west coast Canada. Our focus centers on the creation of a time series forecasting model given the dynamic and static states, actions, and disturbances. This model is designed to predict dynamic states based on the actions provided, subsequently serving as an evaluative tool to assess the proficiency of the ferry's operation under the captain's guidance. Additionally, it lays the foundation for future optimization algorithms, providing valuable feedback on decision-making processes. To facilitate future studies, our code is available at \url{https://github.com/pagand/model_optimze_vessel/tree/AAAI}

CVJan 25, 2025
SpikeDet: Better Firing Patterns for Accurate and Energy-Efficient Object Detection with Spiking Neuron Networks

Yimeng Fan, Changsong Liu, Mingyang Li et al.

Spiking Neural Networks (SNNs) are the third generation of neural networks. They have gained widespread attention in object detection due to their low power consumption and biological interpretability. However, existing SNN-based object detection methods suffer from local firing saturation, where neurons in information-concentrated regions fire continuously throughout all time steps. This abnormal neuron firing pattern reduces the feature discrimination capability and detection accuracy, while also increasing the firing rates that prevent SNNs from achieving their potential energy efficiency. To address this problem, we propose SpikeDet, a novel spiking object detector that optimizes firing patterns for accurate and energy-efficient detection. Specifically, we design a spiking backbone network, MDSNet, which effectively adjusts the membrane synaptic input distribution at each layer, achieving better neuron firing patterns during spiking feature extraction. Additionally, to better utilize and preserve these high-quality backbone features, we introduce the Spiking Multi-direction Fusion Module (SMFM), which realizes multi-direction fusion of spiking features, enhancing the multi-scale detection capability of the model. Experimental results demonstrate that SpikeDet achieves superior performance. On the COCO 2017 dataset, it achieves 51.4% AP, outperforming previous SNN-based methods by 2.5% AP while requiring only half the power consumption. On object detection sub-tasks, including the GEN1 event-based dataset and the URPC 2019 underwater dataset, SpikeDet also achieves the best performance. Notably, on GEN1, our method achieves 47.6% AP, outperforming previous SNN-based methods by 7.2% AP with better energy efficiency.

CVSep 30, 2025
Image-Plane Geometric Decoding for View-Invariant Indoor Scene Reconstruction

Mingyang Li, Yimeng Fan, Changsong Liu et al.

Volume-based indoor scene reconstruction methods offer superior generalization capability and real-time deployment potential. However, existing methods rely on multi-view pixel back-projection ray intersections as weak geometric constraints to determine spatial positions. This dependence results in reconstruction quality being heavily influenced by input view density. Performance degrades in overlapping regions and unobserved areas.To address these limitations, we reduce dependency on inter-view geometric constraints by exploiting spatial information within individual views. We propose an image-plane decoding framework with three core components: Pixel-level Confidence Encoder, Affine Compensation Module, and Image-Plane Spatial Decoder. These modules decode three-dimensional structural information encoded in images through physical imaging processes. The framework effectively preserves spatial geometric features including edges, hollow structures, and complex textures. It significantly enhances view-invariant reconstruction.Experiments on indoor scene reconstruction datasets confirm superior reconstruction stability. Our method maintains nearly identical quality when view count reduces by 40%. It achieves a coefficient of variation of 0.24%, performance retention rate of 99.7%, and maximum performance drop of 0.42%. These results demonstrate that exploiting intra-view spatial information provides a robust solution for view-limited scenarios in practical applications.

CVJun 9, 2024
Learning to utilize image second-order derivative information for crisp edge detection

Changsong Liu, Yimeng Fan, Mingyang Li et al.

Edge detection is a fundamental task in computer vision. It has made great progress under the development of deep convolutional neural networks (DCNNs), some of which have achieved a beyond human-level performance. However, recent top-performing edge detection methods tend to generate thick and noisy edge lines. In this work, we solve this problem from two aspects: (1) the lack of prior knowledge regarding image edges, and (2) the issue of imbalanced pixel distribution. We propose a second-order derivative-based multi-scale contextual enhancement module (SDMCM) to help the model locate true edge pixels accurately by introducing the edge prior knowledge. We also construct a hybrid focal loss function (HFL) to alleviate the imbalanced distribution issue. In addition, we employ the conditionally parameterized convolution (CondConv) to develop a novel boundary refinement module (BRM), which can further refine the final output edge maps. In the end, we propose a U-shape network named LUS-Net which is based on the SDMCM and BRM for crisp edge detection. We perform extensive experiments on three standard benchmarks, and the experiment results illustrate that our method can predict crisp and clean edge maps and achieves state-of-the-art performance on the BSDS500 dataset (ODS=0.829), NYUD-V2 dataset (ODS=0.768), and BIPED dataset (ODS=0.903).