CV · LG · Jun 9, 2025

CBAM-STN-TPS-YOLO: Enhancing Agricultural Object Detection through Spatially Adaptive Attention Mechanisms

arXiv:2506.07357v1 · 4 citations · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses detection challenges in smart farming and offers a lightweight model for real-time edge deployment. It is incremental: it builds on the existing YOLO and STN frameworks, adding Thin-Plate Spline transformations and CBAM attention.

The paper tackles object detection in precision agriculture, where models like YOLO struggle with occlusions and irregular structures. It proposes CBAM-STN-TPS-YOLO, which integrates Thin-Plate Splines into Spatial Transformer Networks for flexible, non-rigid transformations and uses attention mechanisms to suppress background noise. On the PGP dataset, the model achieves a 12% reduction in false positives.
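The attention mechanism named in the title, CBAM, applies channel attention followed by spatial attention to a feature map. The sketch below illustrates that two-step reweighting in plain NumPy. It is not the authors' implementation: the weights are random stand-ins for learned parameters, the reduction ratio and 7x7 kernel follow common CBAM defaults rather than anything stated on this page, and the names `channel_attention`, `spatial_attention`, and `cbam` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """feat: (C, H, W). w1: (C//r, C), w2: (C, C//r) form a shared two-layer MLP.
    Returns per-channel weights of shape (C, 1, 1)."""
    avg = feat.mean(axis=(1, 2))  # global average pooling per channel
    mx = feat.max(axis=(1, 2))    # global max pooling per channel

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    return sigmoid(mlp(avg) + mlp(mx))[:, None, None]


def spatial_attention(feat, kernel):
    """kernel: (2, k, k), applied to the channel-wise mean and max maps.
    Returns per-location weights of shape (1, H, W)."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    out = np.zeros((h, w))
    # Direct "same" cross-correlation, written as a loop over kernel offsets.
    for i in range(k):
        for j in range(k):
            window = padded[:, i:i + h, j:j + w]
            out += np.sum(kernel[:, i, j][:, None, None] * window, axis=0)
    return sigmoid(out)[None]


def cbam(feat, w1, w2, kernel):
    """Channel attention first, then spatial attention on the reweighted map."""
    feat = feat * channel_attention(feat, w1, w2)
    return feat * spatial_attention(feat, kernel)


# Random stand-ins for learned parameters (assumption: C=8, reduction ratio 4).
rng = np.random.default_rng(0)
channels, reduction = 8, 4
feat = rng.standard_normal((channels, 16, 16))
w1 = 0.1 * rng.standard_normal((channels // reduction, channels))
w2 = 0.1 * rng.standard_normal((channels, channels // reduction))
kernel = 0.1 * rng.standard_normal((2, 7, 7))
refined = cbam(feat, w1, w2, kernel)
```

Because both attention maps pass through a sigmoid, every weight lies strictly between 0 and 1, so the module can only attenuate features; in a trained network, background locations and uninformative channels would be the ones pushed toward 0.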

Object detection is vital in precision agriculture for plant monitoring, disease detection, and yield estimation. However, models like YOLO struggle with occlusions, irregular structures, and background noise, reducing detection accuracy. While Spatial Transformer Networks (STNs) improve spatial invariance through learned transformations, affine mappings are insufficient for non-rigid deformations such as bent leaves and overlaps. We propose CBAM-STN-TPS-YOLO, a model integrating Thin-Plate Splines (TPS) into STNs for flexible, non-rigid spatial transformations that better align features. Performance is further enhanced by the Convolutional Block Attention Module (CBAM), which suppresses background noise and emphasizes relevant spatial and channel-wise features. On the occlusion-heavy Plant Growth and Phenotyping (PGP) dataset, our model outperforms STN-YOLO in precision, recall, and mAP. It achieves a 12% reduction in false positives, highlighting the benefits of improved spatial flexibility and attention-guided refinement. We also examine the impact of the TPS regularization parameter in balancing transformation smoothness and detection performance. This lightweight model improves spatial awareness and supports real-time edge deployment, making it ideal for smart farming applications requiring accurate and efficient monitoring.
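To make the role of the TPS regularization parameter concrete, here is a minimal NumPy sketch of fitting a 2D thin-plate spline to control-point correspondences. It is an illustration under stated assumptions, not the paper's code: the control points are hand-picked rather than predicted by an STN localization network, the regularization weight is called `lam` here, and the helper names `fit_tps` and `apply_tps` are hypothetical. The kernel is the standard TPS radial basis U(r) = r^2 log r^2.

```python
import numpy as np


def tps_kernel(r2):
    """TPS radial basis U(r) = r^2 log(r^2), taking squared distances; U(0) = 0."""
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out


def pairwise_sq_dists(a, b):
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def fit_tps(src, dst, lam=0.0):
    """Solve for non-rigid weights w (n, 2) and affine part a (3, 2).
    lam = 0 interpolates the control points exactly; larger lam smooths the warp."""
    n = src.shape[0]
    K = tps_kernel(pairwise_sq_dists(src, src)) + lam * np.eye(n)
    P = np.hstack([np.ones((n, 1)), src])  # rows are [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = K
    A[:n, n:] = P
    A[n:, :n] = P.T
    rhs = np.zeros((n + 3, 2))
    rhs[:n] = dst
    params = np.linalg.solve(A, rhs)
    return params[:n], params[n:]


def apply_tps(points, src, w, a):
    """Warp arbitrary points with a fitted spline."""
    U = tps_kernel(pairwise_sq_dists(points, src))
    P = np.hstack([np.ones((points.shape[0], 1)), points])
    return U @ w + P @ a


# 3x3 control grid on [-1, 1]^2; only the center point is displaced,
# which no affine map can reproduce (a stand-in for a bent leaf).
xs = np.linspace(-1.0, 1.0, 3)
src = np.array([[x, y] for y in xs for x in xs])
dst = src.copy()
dst[4] += [0.3, 0.2]

w_exact, a_exact = fit_tps(src, dst, lam=0.0)
w_smooth, a_smooth = fit_tps(src, dst, lam=10.0)

err_exact = np.abs(apply_tps(src, src, w_exact, a_exact) - dst).max()
err_smooth = np.abs(apply_tps(src, src, w_smooth, a_smooth) - dst).max()
```

With `lam=0.0` the spline passes through every control point, so `err_exact` is at floating-point noise level. With a larger `lam` the non-rigid weights shrink and the warp drifts toward an affine fit, so `err_smooth` is larger. This is the smoothness versus fidelity trade-off the abstract refers to.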

