CV | AI | Sep 2, 2024

DS MYOLO: A Reliable Object Detector Based on SSMs for Driving Scenarios

arXiv:2409.01093v1 | 4 citations | h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient and accurate object detection in advanced driver-assistance systems. The contribution appears incremental, since it builds on existing YOLO and Mamba architectures.

The authors tackle real-time object detection for driving scenarios by proposing DS MYOLO, a YOLO-based detector. It uses simplified selective scanning fusion blocks (SimVSS Block) and an efficient channel attention convolution (ECAConv) to capture global features with linear complexity, and it achieves competitive performance on the CCTSDB 2021 and VLD-45 datasets.

Accurate real-time object detection enhances the safety of advanced driver-assistance systems, making it an essential component in driving scenarios. With the rapid development of deep learning technology, CNN-based YOLO real-time object detectors have gained significant attention. However, the local focus of CNNs results in performance bottlenecks. To further enhance detector performance, researchers have introduced Transformer-based self-attention mechanisms to leverage global receptive fields, but their quadratic complexity incurs substantial computational costs. Recently, Mamba, with its linear complexity, has made significant progress through global selective scanning. Inspired by Mamba's outstanding performance, we propose a novel object detector: DS MYOLO. This detector captures global feature information through a simplified selective scanning fusion block (SimVSS Block) and effectively integrates the network's deep features. Additionally, we introduce an efficient channel attention convolution (ECAConv) that enhances cross-channel feature interaction while maintaining low computational complexity. Extensive experiments on the CCTSDB 2021 and VLD-45 driving scenarios datasets demonstrate that DS MYOLO exhibits significant potential and competitive advantage among similarly scaled YOLO series real-time object detectors.
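The two ideas named in the abstract, a linear-complexity selective scan and a lightweight channel attention, can be illustrated with a toy sketch. This is not the paper's SimVSS Block or ECAConv, whose internals are not described here. The function names `selective_scan_1d` and `eca_weights`, the scalar state, the softplus step size, and the fixed averaging kernel are all assumptions made for illustration; in a real model the corresponding parameters would be learned.

```python
import math


def selective_scan_1d(xs, a=-1.0):
    """Toy 1-D selective scan: a single pass over the sequence, so O(L) time.

    Assumption: the step size delta_t is computed from the input itself via
    softplus, standing in for learned input-dependent projections.
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = math.log1p(math.exp(x))  # softplus, always positive
        a_bar = math.exp(delta * a)      # decay factor in (0, 1) when a < 0
        b_bar = delta                    # simple Euler-style input gain
        h = a_bar * h + b_bar * x        # recurrent state update
        ys.append(h)
    return ys


def eca_weights(channel_means, kernel_size=3):
    """ECA-style channel gates from globally pooled channel descriptors.

    A small 1-D convolution slides across neighbouring channels, followed by
    a sigmoid. Assumption: a fixed averaging kernel replaces learned weights.
    """
    pad = kernel_size // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    kernel = [1.0 / kernel_size] * kernel_size  # placeholder, not learned
    gates = []
    for i in range(len(channel_means)):
        s = sum(w * padded[i + j] for j, w in enumerate(kernel))
        gates.append(1.0 / (1.0 + math.exp(-s)))  # sigmoid gate in (0, 1)
    return gates


if __name__ == "__main__":
    print(selective_scan_1d([0.5, -0.2, 1.0]))
    print(eca_weights([0.2, 1.5, -0.3, 0.8]))
```

The scan touches each position once and carries a fixed-size state, which is where the linear cost comes from, in contrast to self-attention, which compares every pair of positions. The channel gates cost only `kernel_size` multiplications per channel, and each gate would then rescale its channel's feature map.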

