Recurrent Soft Attention Model for Common Object Recognition
This work addresses object recognition for computer vision applications, but it appears incremental as it builds on existing attention and LSTM methods.
The authors tackle object detection and recognition by proposing a Recurrent Soft Attention Model that integrates visual attention with LSTM memory cells, and they evaluate it by top-1 accuracy on the CIFAR-10 dataset.
We propose the Recurrent Soft Attention Model, which feeds visual attention from the original image into an LSTM memory cell through a down-sample network. The model recurrently transmits visual attention to the memory cells for glimpse mask generation, which is a more natural way to integrate and exploit attention in general object detection and recognition problems. We evaluate the model by top-1 accuracy on the CIFAR-10 dataset. The experiments show that the down-sample network and the feedback mechanism both play an effective role within the overall network structure.
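The recurrent loop described in the abstract (attended image, down-sample network, LSTM memory cell, glimpse mask fed back to the image) can be illustrated with a minimal sketch. The abstract does not give the architecture, so everything specific here is an assumption: average pooling stands in for the down-sample network, the glimpse mask is a sigmoid of a linear projection of the LSTM hidden state, the weights are random and untrained, and the classifier head is omitted. The names `downsample`, `lstm_step`, and `run_glimpses` are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def downsample(image, factor):
    """Average-pool a 2D list by `factor` and flatten the result.
    ASSUMPTION: a stand-in for the paper's down-sample network.
    Height and width must be divisible by `factor`."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h, factor):
        for j in range(0, w, factor):
            block = [image[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            out.append(sum(block) / len(block))
    return out

def linear(weights, vec):
    """Matrix-vector product with `weights` given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def lstm_step(x, h, c, W):
    """One standard LSTM cell update (biases omitted for brevity).
    W maps the concatenation [x; h] to the four gate pre-activations."""
    n = len(h)
    z = linear(W, x + h)
    i = [sigmoid(v) for v in z[0:n]]            # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]        # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]    # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]  # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new

def run_glimpses(image, steps=3, factor=4, hidden=8, seed=0):
    """Run the feedback loop for `steps` glimpses; return the last mask and hidden state."""
    rng = random.Random(seed)
    h_img, w_img = len(image), len(image[0])
    feat_dim = (h_img // factor) * (w_img // factor)
    W_lstm = rand_matrix(4 * hidden, feat_dim + hidden, rng)  # untrained weights
    W_mask = rand_matrix(h_img * w_img, hidden, rng)          # ASSUMPTION: linear mask head
    h = [0.0] * hidden
    c = [0.0] * hidden
    mask = [[1.0] * w_img for _ in range(h_img)]  # first glimpse sees the whole image
    for _ in range(steps):
        # Soft attention: weight every pixel by the current mask.
        attended = [[image[r][k] * mask[r][k] for k in range(w_img)]
                    for r in range(h_img)]
        x = downsample(attended, factor)
        h, c = lstm_step(x, h, c, W_lstm)
        # Feedback: the memory cell's hidden state generates the next glimpse mask.
        flat = [sigmoid(v) for v in linear(W_mask, h)]
        mask = [flat[r * w_img:(r + 1) * w_img] for r in range(h_img)]
    return mask, h
```

Calling `run_glimpses(image)` on an 8x8 grayscale image (a list of 8 lists of 8 floats) returns an 8x8 mask with entries in (0, 1) and an 8-dimensional hidden state. In a trained model the hidden state would presumably feed a classifier over the ten CIFAR-10 classes, but that part is not specified in the text above and is left out here.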