CVIVMay 31, 2020

EBBINNOT: A Hardware Efficient Hybrid Event-Frame Tracker for Stationary Dynamic Vision Sensors

arXiv:2006.00422v4
Originality Incremental advance
AI Analysis

This work addresses low-power, battery-powered IoT applications for traffic monitoring, offering an incremental improvement over existing methods.

The paper tackles object detection and tracking for stationary dynamic vision sensors (DVS) in traffic monitoring, proposing a hybrid event-frame approach that reduces computational needs by approximately 6 times while achieving similar accuracy to existing event-based trackers.

As an alternative sensing paradigm, dynamic vision sensors (DVS) have been recently explored to tackle scenarios where conventional sensors result in high data rate and processing time. This paper presents a hybrid event-frame approach for detecting and tracking objects recorded by a stationary neuromorphic sensor, thereby exploiting the sparse DVS output in a low-power setting for traffic monitoring. Specifically, we propose a hardware efficient processing pipeline that optimizes memory and computational needs that enable long-term battery powered usage for IoT applications. To exploit the background removal property of a static DVS, we propose an event-based binary image creation that signals presence or absence of events in a frame duration. This reduces memory requirement and enables usage of simple algorithms like median filtering and connected component labeling for denoise and region proposal respectively. To overcome the fragmentation issue, a YOLO inspired neural network based detector and classifier to merge fragmented region proposals has been proposed. Finally, a new overlap based tracker was implemented, exploiting overlap between detections and tracks is proposed with heuristics to overcome occlusion. The proposed pipeline is evaluated with more than 5 hours of traffic recording spanning three different locations on two different neuromorphic sensors (DVS and CeleX) and demonstrate similar performance. Compared to existing event-based feature trackers, our method provides similar accuracy while needing approx 6 times less computes. To the best of our knowledge, this is the first time a stationary DVS based traffic monitoring solution is extensively compared to simultaneously recorded RGB frame-based methods while showing tremendous promise by outperforming state-of-the-art deep learning solutions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes