Super-resolution of spatiotemporal event-stream image captured by the asynchronous temporal contrast vision sensor
This work addresses the problem of enhancing resolution for event-based vision sensors, enabling potential applications in this domain, but it is incremental as it adapts existing super-resolution concepts to a new sensor type.
The paper tackles super-resolution for spatiotemporal event-stream images from dynamic vision sensors, proposing a two-stage method that models events as a nonhomogeneous Poisson process and uses sparse representation and spatiotemporal filtering to generate high-resolution event sequences, resulting in upscaled images with higher spatial texture detail and well-matched temporal properties.
Super-resolution (SR) is a useful technology to generate a high-resolution (HR) visual output from the low-resolution (LR) visual inputs overcoming the physical limitations of the cameras. However, SR has not been applied to enhance the resolution of spatiotemporal event-stream images captured by the frame-free dynamic vision sensors (DVSs). SR of event-stream image is fundamentally different from existing frame-based schemes since basically each pixel value of DVS images is an event sequence. In this work, a two-stage scheme is proposed to solve the SR problem of the spatiotemporal event-stream image. We use a nonhomogeneous Poisson point process to model the event sequence, and sample the events of each pixel by simulating a nonhomogeneous Poisson process according to the specified event number and rate function. Firstly, the event number of each pixel of the HR DVS image is determined with a sparse signal representation based method to obtain the HR event-count map from that of the LR DVS recording. The rate function over time line of the point process of each HR pixel is computed using a spatiotemporal filter on the corresponding LR neighbor pixels. Secondly, the event sequence of each new pixel is generated with a thinning based event sampling algorithm. Two metrics are proposed to assess the event-stream SR results. The proposed method is demonstrated through obtaining HR event-stream images from a series of DVS recordings with the proposed method. Results show that the upscaled HR event streams has perceptually higher spatial texture detail than the LR DVS images. Besides, the temporal properties of the upscaled HR event streams match that of the original input very well. This work enables many potential applications of event-based vision.