Accelerated Event-Based Feature Detection and Compression for Surveillance Video Systems
This work addresses the high data rates that surveillance video imposes on downstream vision applications. The contribution is incremental, as it builds on the existing ADDER framework.
The paper tackles the inefficiency of processing decoded image frames in surveillance video by proposing a system that converts videos to sparse asynchronous events, achieving a median 43.7% speed improvement in FAST feature detection compared to OpenCV.
The strong temporal consistency of surveillance video enables compelling compression performance with traditional methods, but downstream vision applications operate on decoded image frames with a high data rate. Since it is not straightforward for applications to extract information on temporal redundancy from the compressed video representations, we propose a novel system that conveys temporal redundancy within a sparse decompressed representation. We leverage a video representation framework called ADDER to transcode framed videos to sparse, asynchronous intensity samples. We introduce mechanisms for content adaptation, lossy compression, and asynchronous forms of classical vision algorithms. We evaluate our system on the VIRAT surveillance video dataset, and we show a median 43.7% speed improvement in FAST feature detection compared to OpenCV. We run the same algorithm as OpenCV, but process only the pixels that receive new asynchronous events, rather than processing every pixel in an image frame. Our work paves the way for upcoming neuromorphic sensors and is amenable to future applications with spiking neural networks.
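To illustrate the idea of running FAST only on pixels that receive new events, here is a minimal sketch in plain Python. It is not the authors' implementation and does not use the ADDER API. The event format `(x, y, intensity)`, the function names `is_fast_corner` and `process_events`, and the threshold of 20 are all assumptions made for illustration. The sketch applies the standard FAST-9 segment test (at least 9 contiguous pixels on a 16-pixel ring that are all brighter or all darker than the center by a threshold) and re-tests only the pixel that just changed. It ignores the fact that a changed pixel can also alter the result for neighbors whose ring contains it.

```python
# Offsets (dx, dy) of the 16-pixel ring of radius 3 used by the FAST test,
# listed in order around the circle so that neighbors in the list are adjacent.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def is_fast_corner(img, x, y, threshold=20, arc=9):
    """Segment test at (x, y) on a grayscale image stored as a list of rows."""
    h, w = len(img), len(img[0])
    # The ring needs a 3-pixel margin, so border pixels are never corners.
    if x < 3 or y < 3 or x >= w - 3 or y >= h - 3:
        return False
    center = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [p > center + threshold for p in ring]
    darker = [p < center - threshold for p in ring]
    for mask in (brighter, darker):
        run = 0
        # Walk the ring twice so runs that wrap past the last entry are counted.
        for flag in mask + mask:
            run = run + 1 if flag else 0
            if run >= arc:
                return True
    return False


def process_events(img, events, threshold=20):
    """Apply each hypothetical (x, y, intensity) event and test only that pixel."""
    corners = set()
    for x, y, intensity in events:
        img[y][x] = intensity  # update the running intensity image in place
        if is_fast_corner(img, x, y, threshold):
            corners.add((x, y))
    return corners


# Usage: a single bright event on a dark 16x16 image.
image = [[0] * 16 for _ in range(16)]
found = process_events(image, [(8, 8, 200)])
```

The work done per update scales with the number of events rather than with the frame size, which is the source of the saving the abstract describes when most pixels in a surveillance scene are static. A frame-based detector would instead call the segment test for every interior pixel of every frame.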