v2e: From Video Frames to Realistic DVS Events
v2e provides a practical solution for researchers and developers who need DVS event data to train networks for uncontrolled lighting conditions, though the contribution is incremental because it builds on existing synthetic event generation methods.
The paper introduces v2e, a toolbox that generates realistic synthetic DVS events from video frames to address the lack of event camera data. It shows that pretraining with v2e events improves object recognition generalization on real DVS data, and that in night driving a car detector trained on v2e events improves average accuracy by 40% compared to training on intensity frames.
To help meet the increasing need for dynamic vision sensor (DVS) event camera data, this paper proposes the v2e toolbox, which generates realistic synthetic DVS events from intensity frames. It also clarifies incorrect claims about DVS motion blur and latency characteristics in recent literature. Unlike other toolboxes, v2e includes pixel-level Gaussian event threshold mismatch, finite intensity-dependent bandwidth, and intensity-dependent noise. Realistic DVS events are useful in training networks for uncontrolled lighting conditions. The use of v2e synthetic events is demonstrated in two experiments. The first experiment is object recognition with the N-Caltech 101 dataset. Results show that pretraining on various v2e lighting conditions improves generalization when transferred to real DVS data for a ResNet model. The second experiment shows that for night driving, a car detector trained with v2e events shows an average accuracy improvement of 40% compared to YOLOv3 trained on intensity frames.
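To make the idea of pixel-level Gaussian threshold mismatch concrete, the following is a minimal sketch of frame-to-event conversion. It is not the v2e implementation and does not use the v2e API: the function name `generate_events`, its parameters, and the default values are all hypothetical, chosen for illustration. The sketch assumes the common DVS model in which a pixel emits an ON or OFF event each time its log intensity moves by one threshold away from a memorized value, and it deliberately omits the finite intensity-dependent bandwidth and intensity-dependent noise that the abstract lists.

```python
import math
import random


def generate_events(frames, theta_mean=0.2, theta_sigma=0.03, seed=0):
    """Convert intensity frames to DVS-like events (illustrative sketch only).

    frames: list of 2D lists of non-negative intensities, all the same shape.
    Returns a list of (frame_index, x, y, polarity) tuples, polarity in {+1, -1}.
    All names and default values here are assumptions, not taken from v2e.
    """
    rng = random.Random(seed)
    height, width = len(frames[0]), len(frames[0][0])
    eps = 1e-6  # avoids log(0) for fully dark pixels

    # Each pixel gets its own threshold drawn from a Gaussian (threshold mismatch).
    # Thresholds are clamped to stay positive.
    theta = [[max(0.01, rng.gauss(theta_mean, theta_sigma)) for _ in range(width)]
             for _ in range(height)]

    # Memorized log intensity per pixel, initialized from the first frame.
    mem = [[math.log(frames[0][y][x] + eps) for x in range(width)]
           for y in range(height)]

    events = []
    for k, frame in enumerate(frames[1:], start=1):
        for y in range(height):
            for x in range(width):
                delta = math.log(frame[y][x] + eps) - mem[y][x]
                n = int(abs(delta) // theta[y][x])  # whole thresholds crossed
                if n == 0:
                    continue
                polarity = 1 if delta > 0 else -1
                events.extend([(k, x, y, polarity)] * n)
                # Advance the memorized value by exactly the thresholds consumed.
                mem[y][x] += polarity * n * theta[y][x]
    return events


# Usage: one pixel that brightens between two frames.
frames = [[[1.0]], [[math.exp(0.5)]]]
events = generate_events(frames, theta_sigma=0.0)
print(events)
```

With `theta_sigma=0.0` every pixel shares the same threshold, which is the idealized case; a nonzero value makes otherwise identical pixels fire at slightly different contrast changes, which is the effect the mismatch term is meant to capture.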