Attention Mechanisms for Object Recognition with Event-Based Cameras
This work addresses object recognition with event-based cameras, offering incremental improvements in translation and scale invariance.
The paper tackles object recognition with event-based cameras by proposing two attention models that locate regions of interest, improving translation and scale invariance on four datasets compared to a baseline model.
Event-based cameras are neuromorphic sensors capable of efficiently encoding visual information in the form of sparse sequences of events. Being biologically inspired, they are commonly used to exploit some of the computational and power consumption benefits of biological vision. In this paper we focus on a specific feature of vision: visual attention. We propose two attentive models for event-based vision: an algorithm that tracks event activity within the field of view to locate regions of interest, and a fully differentiable attention procedure based on the DRAW neural model. We highlight the strengths and weaknesses of the proposed methods on four datasets (Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101), using the Phased LSTM recognition network as a baseline reference model, and obtain improvements in both translation and scale invariance.
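To make the idea of tracking event activity to locate a region of interest more concrete, the following is a minimal sketch and not the paper's algorithm. It assumes events arrive as (x, y, timestamp, polarity) tuples, that per-pixel activity decays exponentially, and that the region of interest is the bounding box of pixels whose activity is close to the current peak. The class name `ActivityTracker`, the decay constant `tau` and the threshold `rel_threshold` are hypothetical choices made only for illustration.

```python
import math
from typing import Iterable, List, Optional, Tuple

# Assumed event layout: (x, y, timestamp in seconds, polarity).
Event = Tuple[int, int, float, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


class ActivityTracker:
    """Per-pixel event activity with exponential decay (illustrative only)."""

    def __init__(self, width: int, height: int, tau: float = 0.05) -> None:
        self.width = width
        self.height = height
        self.tau = tau  # decay time constant in seconds (assumed value)
        self.activity: List[List[float]] = [[0.0] * width for _ in range(height)]
        self.last_t: List[List[float]] = [[0.0] * width for _ in range(height)]

    def update(self, events: Iterable[Event]) -> None:
        """Decay each touched pixel to the event time, then add one unit."""
        for x, y, t, _polarity in events:
            decay = math.exp(-(t - self.last_t[y][x]) / self.tau)
            self.activity[y][x] = self.activity[y][x] * decay + 1.0
            self.last_t[y][x] = t

    def region_of_interest(self, now: float, rel_threshold: float = 0.5) -> Optional[Box]:
        """Bounding box of pixels whose decayed activity is near the peak."""
        decayed = [
            [
                self.activity[y][x] * math.exp(-(now - self.last_t[y][x]) / self.tau)
                for x in range(self.width)
            ]
            for y in range(self.height)
        ]
        peak = max(max(row) for row in decayed)
        if peak <= 0.0:
            return None  # no events seen yet
        xs = [x for y in range(self.height) for x in range(self.width)
              if decayed[y][x] >= rel_threshold * peak]
        ys = [y for y in range(self.height) for x in range(self.width)
              if decayed[y][x] >= rel_threshold * peak]
        return min(xs), min(ys), max(xs), max(ys)
```

A recognition network could then be fed only the events that fall inside the returned box, which is one way such a crop might reduce sensitivity to where the object appears in the field of view. The relative threshold is a design choice that trades robustness to isolated noise events against the risk of clipping weakly active parts of the object.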