EGD-YOLO: A Lightweight Multimodal Framework for Robust Drone-Bird Discrimination via Ghost-Enhanced YOLOv8n and EMA Attention under Adverse Condition
This work addresses airspace safety and security through drone detection, but its contribution is incremental: it builds on the existing YOLOv8n detector and adds modifications for multimodal (RGB and infrared) data.
The study tackles the problem of distinguishing drones from birds under adverse conditions by developing EGD-YOLOv8n, a lightweight detection framework trained on RGB and infrared images. Of the three variants trained (RGB only, IR only, and RGB plus IR), the combined model achieved the best accuracy and reliability while running fast enough for real-time use on common GPUs.
Identifying drones and birds correctly is essential for keeping the skies safe and improving security systems. Using the VIP CUP 2025 dataset, which provides both RGB and infrared (IR) images, this study presents EGD-YOLOv8n, a new lightweight yet powerful model for object detection. The model improves how image features are captured and understood, making detection more accurate and efficient. It uses smart design changes and attention layers to focus on important details while reducing the amount of computation needed. A special detection head helps the model adapt to objects of different shapes and sizes. We trained three versions: one using RGB images, one using IR images, and one combining both. The combined model achieved the best accuracy and reliability while running fast enough for real-time use on common GPUs.
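To make the claim of reduced computation concrete, the sketch below compares the weight count of a standard convolution with that of a Ghost module, the building block named in the title. It follows the general Ghost module formulation (a smaller primary convolution plus cheap depthwise operations) and is not taken from the EGD-YOLOv8n implementation. The function names, the channel sizes, the ratio of 2, and the 3x3 cheap kernel are all illustrative assumptions, since the abstract does not state how the modules are configured.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost module (bias ignored).

    Assumed layout (hypothetical configuration, not from the paper):
      1. a primary k x k convolution produces c_out / ratio "intrinsic" maps;
      2. each intrinsic map yields (ratio - 1) extra "ghost" maps through a
         cheap depthwise cheap_k x cheap_k convolution.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap


if __name__ == "__main__":
    # Illustrative layer: 64 input channels, 128 output channels, 3x3 kernel.
    standard = conv_params(64, 128, 3)
    ghost = ghost_params(64, 128, 3)
    print(f"standard conv weights: {standard}")
    print(f"ghost module weights:  {ghost}")
    print(f"reduction factor:      {standard / ghost:.2f}")
```

With a ratio of 2, the Ghost layer needs roughly half the weights of the standard layer, which is the kind of saving a lightweight backbone relies on. The actual savings in EGD-YOLOv8n depend on which layers are replaced and on settings the abstract does not report.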