Towards Good Practices on Building Effective CNN Baseline Model for Person Re-identification
This work addresses the incremental improvement of baseline models for researchers in person re-identification, a domain-specific computer vision task.
The paper tackles the problem of building effective CNN baseline models for person re-identification by proposing three good practices: adding batch normalization after the global pooling layer, classifying identities with a single fully-connected layer, and using the Adam optimizer. Together these achieve state-of-the-art performance on three benchmark datasets.
Person re-identification is a challenging visual recognition task because of human pose variation, body occlusion, camera view variation, and related factors. Most state-of-the-art approaches are therefore built on deep convolutional neural networks (CNNs), leveraging their strong feature learning power and capacity for fitting classification boundaries. Despite the vital role of the CNN baseline model in person re-identification, how to build an effective one has not been well studied. To answer this open question, we propose three good practices in this paper, from the perspectives of adjusting the CNN architecture and the training procedure: adding batch normalization after the global pooling layer, performing identity categorization directly with only one fully-connected layer, and using Adam as the optimizer. Extensive experiments on three widely used benchmark datasets demonstrate that these practices enable the CNN baseline model to achieve state-of-the-art performance without any other high-level domain knowledge or low-level technical tricks.
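To make the three practices concrete, the following is a minimal, dependency-free sketch of the operations they name, written from the standard textbook definitions of batch normalization and Adam. It is not the authors' implementation: the function names, tensor layouts, and hyperparameter defaults are assumptions chosen for illustration, the CNN backbone is omitted, and the batch normalization shown uses batch statistics only, with no learned scale/shift or running averages.

```python
import math

def global_avg_pool(feature_maps):
    """Average each channel's H x W map into one value (one image)."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]

def batch_norm(batch, eps=1e-5):
    """Practice 1: normalize each pooled feature dimension across the batch.

    Training-mode statistics only; learned scale/shift are left out
    (an assumption made to keep the sketch short).
    """
    n, dims = len(batch), len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        column = [sample[d] for sample in batch]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i in range(n):
            out[i][d] = (batch[i][d] - mean) / math.sqrt(var + eps)
    return out

def fc_logits(features, weights, bias):
    """Practice 2: a single fully-connected layer, one logit per identity."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, bias)
    ]

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Practice 3: one Adam update for a scalar parameter at step t >= 1.

    Returns the updated (param, m, v). Defaults are the commonly used
    Adam values, not settings reported in the paper.
    """
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    return param - lr * m_hat / (math.sqrt(v_hat) + eps), m, v
```

In this sketch, a forward pass for a batch would pool each image's backbone output with `global_avg_pool`, normalize the pooled vectors with `batch_norm`, and feed each normalized vector to `fc_logits`; `adam_step` would then be applied to every trainable parameter using its gradient.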