Gaining Scale Invariance in UAV Bird's Eye View Object Detection by Adaptive Resizing
This work addresses the challenge of scale variance in UAV-based object detection. It is an incremental improvement with practical deployment benefits.
The paper tackles scale variance in UAV bird's eye view object detection by introducing Adaptive Resizing, a preprocessing step that improves inference speed by 2-3 times while achieving consistent performance gains across multiple datasets including UAVDT, VisDrone, and a new custom dataset.
This work introduces Adaptive Resizing, a new preprocessing step for object detection in UAV bird's eye view imagery. By design, it alleviates the challenges posed by the large variance in object scales that is inherent to UAV data sets. It also speeds up inference by a factor of two to three on average. We test the method extensively on UAVDT, VisDrone, and a new data set we captured ourselves, and achieve consistent improvements while being considerably faster. Moreover, we show how to apply the method to generic UAV object detection tasks. We also successfully test our approach on a height transfer task, in which we train on one interval of altitudes and test on a different one. Finally, we introduce a small, fast detector meant for deployment on an embedded GPU. Code will be made publicly available on our website.
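The abstract does not spell out how Adaptive Resizing works, so the following is only a minimal sketch of one plausible reading, not the released implementation. It assumes that each frame comes with a flight altitude and that, in a bird's eye view, an object's pixel size is roughly inversely proportional to altitude. Under that assumption, rescaling every image by `altitude / reference altitude` brings objects to a common apparent size. The function names, the reference altitude, and the clamping bounds are all hypothetical.

```python
def adaptive_scale(altitude_m, ref_altitude_m, min_scale=0.25, max_scale=2.0):
    """Return the resize factor that maps an image taken at altitude_m
    to the apparent object size seen at ref_altitude_m.

    Assumption (not from the paper): object pixel size ~ 1 / altitude,
    which holds approximately for a downward-facing camera.
    """
    if altitude_m <= 0 or ref_altitude_m <= 0:
        raise ValueError("altitudes must be positive")
    scale = altitude_m / ref_altitude_m
    # Clamp so extreme altitudes do not produce huge or tiny images
    # (the bounds are illustrative choices).
    return max(min_scale, min(max_scale, scale))


def adaptive_size(width, height, altitude_m, ref_altitude_m):
    """Compute the target (width, height) in pixels for the resized image."""
    s = adaptive_scale(altitude_m, ref_altitude_m)
    return max(1, round(width * s)), max(1, round(height * s))


def rescale_boxes(boxes, scale):
    """Scale (x1, y1, x2, y2) boxes by the same factor as the image."""
    return [tuple(v * scale for v in box) for box in boxes]


if __name__ == "__main__":
    # A frame captured at 30 m, normalized to a 60 m reference altitude,
    # is downscaled by a factor of two.
    print(adaptive_size(1920, 1080, altitude_m=30.0, ref_altitude_m=60.0))
```

Under this reading, frames captured below the reference altitude shrink before they reach the detector, which would be consistent with the reported speed-up, though the abstract does not confirm that this is the source of the gain.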