CV | Dec 19, 2017

Learning Fixation Point Strategy for Object Detection and Classification

arXiv:1712.06897v1 | 0.91 citations

Originality: Incremental advance

AI Analysis

This work addresses object detection and classification for computer vision applications, presenting an incremental improvement with a novel attentional strategy.

The authors tackle object detection and classification with a recurrent attentional structure that learns to extract a sequence of local observations. The method achieves high speed on large images without pooling operations, and precision and speed can be improved by adjusting the number of recurrent steps.

We propose a novel recurrent attentional structure to localize and recognize objects jointly. The network learns to extract a sequence of local observations with detailed appearance and rough context, instead of applying sliding windows or convolutions to the entire image. These observations are then fused to complete the detection and classification tasks. For training, we present a hybrid loss function to learn the parameters of the multi-task network end-to-end. In particular, the combination of stochastic and object-awareness strategies, named SA, selects more abundant context and ensures that the last fixation is close to the object. In addition, we build a real-world dataset to verify the capacity of our method to detect objects of interest, including small ones. Our method can predict a precise bounding box on an image and achieves high speed on large images without pooling operations. Experimental results indicate that the proposed method can mine effective context from several local observations. Moreover, precision and speed can easily be improved by changing the number of recurrent steps. Finally, we will release the source code of our proposed approach.
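
The idea of a local observation with "detailed appearance and rough context" can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no architectural details, so the function name `extract_glimpse`, the patch size, the number of scales, the zero padding, and the use of plain strided subsampling are all assumptions made for illustration. The random choice of the next fixation is likewise only a placeholder for the learned fixation strategy.

```python
import numpy as np

def extract_glimpse(image, center, size=8, scales=2):
    """Crop `scales` square patches centred on `center` = (row, col).

    Patch k spans size * 2**k pixels per side and is subsampled back to
    size x size, so patch 0 keeps fine detail and later patches give a
    coarser view of a wider area. Regions outside the image are zero.
    (Hypothetical helper, not taken from the paper.)
    """
    row, col = center
    patches = []
    for k in range(scales):
        half = (size * 2 ** k) // 2
        padded = np.pad(image, half, mode="constant")
        # After padding by `half`, this window is centred on (row, col).
        window = padded[row:row + 2 * half, col:col + 2 * half]
        step = 2 ** k
        patches.append(window[::step, ::step])
    return np.stack(patches)  # shape: (scales, size, size)

# A short sequence of fixations on a synthetic grayscale image.
rng = np.random.default_rng(0)
image = np.arange(64 * 64, dtype=float).reshape(64, 64)
location = (32, 32)
observations = []
for _ in range(3):
    observations.append(extract_glimpse(image, location))
    # Placeholder policy: a trained model would predict the next fixation.
    location = (int(rng.integers(0, 64)), int(rng.integers(0, 64)))
```

Each observation has a fixed size regardless of the image size, which is one way to see why the cost of such a model could depend on the number of recurrent steps rather than on the image resolution.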
