CV · Jun 3

Tiny Collaborative Inference for Occlusion-Robust Object Detection

Chieh-Tung Cheng, Mustafa Aslanov, Eiman Kanjo

arXiv:2606.02894 · 13.7 · h-index: 2

Predicted impact: top 94% in CV · last 90 days
Originality: Incremental advance

AI Analysis

For search and rescue applications requiring occlusion-robust detection on severely resource-constrained edge devices, this paper provides a practical, low-overhead collaborative inference solution.

This work demonstrates that decision-level fusion (Weighted Boxes Fusion) improves occlusion-robust object detection on ultra-low-end edge devices (<1 MB SRAM), achieving up to +0.3827 mAP gain with three views and a 29.8% coverage gain in autonomous operation, while keeping communication overhead low (~1.3 KB per exchange).
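The 29.8% coverage figure can be checked against the frame counts reported in the abstract (61 fused frames versus 47 single-board frames, out of 108). The working below assumes the gain is measured relative to the single-board count, which reproduces the reported value:

```latex
\[
\text{coverage gain} = \frac{61 - 47}{47} = \frac{14}{47} \approx 0.298 = 29.8\%
\]
\[
\text{fused coverage} = \frac{61}{108} \approx 56.5\%, \qquad
\text{single-board coverage} = \frac{47}{108} \approx 43.5\%
\]
```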

Edge AI nodes for search and rescue are increasingly expected to run computer vision locally, yet ultra-low-end hardware imposes hard constraints on memory, compute, and inter-device communication. This work addresses occlusion-robust object detection on devices with less than 1 MB SRAM by combining an MCUNet backbone, a YOLOv2 detection head, and Lite quantisation. Two collaborative inference strategies are evaluated: feature-level fusion, concatenating intermediate feature maps, and decision-level fusion via Weighted Boxes Fusion (WBF). WBF outperforms feature-level fusion under all tested occlusion conditions, yielding gains of up to +0.2736 mAP in asymmetric scenarios. Extending fusion to three views improves accuracy further (up to +0.3827 mAP) at modest communication overhead (~1.3 KB per exchange). Hardware experiments progress from a host-assisted USB-relay baseline to a Wi-Fi peer-to-peer deployment on two Coral Dev Board Micro units, where WBF executes on-device with negligible communication energy relative to inference. In a 301.9 s autonomous session of 108 frames, fused output is produced on 61 frames versus 47 for a single board - a coverage gain of +29.8%. A decentralised federated learning feasibility note is included but not treated as a primary result, as performance remains limited under non-iid data. The results support decision-level fusion as a viable option for improving occlusion robustness in small-scale edge object detection, including host-free multi-board operation on ultra-low-end hardware.
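To make the decision-level fusion step concrete, here is a minimal Python sketch of Weighted Boxes Fusion over detections from several views. It is an illustration, not the authors' implementation: the `Box` type, the function names, and the `iou_thr=0.55` default are hypothetical, and it assumes every view's boxes have already been mapped into one shared image coordinate frame (the abstract does not say how views are aligned).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    # Hypothetical detection record: corner coordinates, confidence, class id.
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: int


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    ih = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0


def _fuse(cluster: List[Box]) -> Box:
    """Confidence-weighted average of coordinates; score is the mean confidence."""
    total = sum(b.score for b in cluster)
    return Box(
        x1=sum(b.x1 * b.score for b in cluster) / total,
        y1=sum(b.y1 * b.score for b in cluster) / total,
        x2=sum(b.x2 * b.score for b in cluster) / total,
        y2=sum(b.y2 * b.score for b in cluster) / total,
        score=total / len(cluster),
        label=cluster[0].label,
    )


def weighted_boxes_fusion(views: List[List[Box]], iou_thr: float = 0.55) -> List[Box]:
    """Fuse per-view detections; assumes a shared coordinate frame (see lead-in)."""
    n_views = len(views)
    # Visit boxes from highest to lowest confidence; skip zero-score boxes.
    boxes = sorted(
        (b for view in views for b in view if b.score > 0),
        key=lambda b: b.score,
        reverse=True,
    )
    clusters: List[List[Box]] = []
    fused: List[Box] = []
    for box in boxes:
        best, best_iou = None, iou_thr
        for i, f in enumerate(fused):
            if f.label != box.label:
                continue
            overlap = iou(f, box)
            if overlap > best_iou:
                best, best_iou = i, overlap
        if best is None:
            clusters.append([box])
            fused.append(_fuse([box]))
        else:
            clusters[best].append(box)
            fused[best] = _fuse(clusters[best])
    # Down-weight boxes that only some of the views agree on.
    return [
        Box(f.x1, f.y1, f.x2, f.y2, f.score * min(len(c), n_views) / n_views, f.label)
        for c, f in zip(clusters, fused)
    ]


view_a = [Box(10, 10, 50, 50, 0.9, 0)]
view_b = [Box(12, 12, 52, 52, 0.6, 0), Box(100, 100, 140, 140, 0.8, 1)]
result = weighted_boxes_fusion([view_a, view_b])
```

In this toy input the two overlapping class-0 boxes merge into one box pulled toward the more confident view, while the class-1 box seen by only one view keeps its coordinates but has its confidence halved. Because only box coordinates, scores, and labels are exchanged rather than feature maps, the per-exchange payload stays small, which is consistent with the low communication overhead the abstract reports.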
