Backbone Can Not be Trained at Once: Rolling Back to Pre-trained Network for Person Re-Identification
This work addresses a specific bottleneck in fine-tuning for person re-identification, namely the insufficient training of low-level layers, and offers an incremental improvement for researchers and practitioners in computer vision.
The paper tackles the gradient vanishing problem in fine-tuning low-level layers for person re-identification by proposing a strategy that rolls back high-level layer weights to their pre-trained initial values, resulting in state-of-the-art performance without additional modules.
In person re-identification (ReID), training data is scarce, so it is common to fine-tune a classification network pre-trained on a large dataset. However, it is relatively difficult to sufficiently fine-tune the low-level layers of the network because of the gradient vanishing problem. In this work, we propose a novel fine-tuning strategy that allows the low-level layers to be sufficiently trained by rolling back the weights of the high-level layers to their initial pre-trained values. Our strategy alleviates the gradient vanishing problem in the low-level layers and robustly trains them to fit the ReID dataset, thereby improving ReID performance. The improvement is validated in several experiments. Furthermore, without any add-ons such as pose estimation or segmentation, our strategy achieves state-of-the-art performance using only a vanilla deep convolutional neural network architecture.
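The rollback step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `roll_back_high_level`, the layer names `block1` to `block3`, the toy weight values, and the choice of which layers count as "low-level" are all assumptions made for illustration. Weights are modeled as plain lists so the example runs without a deep learning framework. The abstract also does not specify how often rollback is applied or how training resumes afterward, so the sketch covers only a single rollback.

```python
import copy


def roll_back_high_level(current, pretrained, low_level_names):
    """Return a new weight dict after one rollback (hypothetical helper).

    Layers named in `low_level_names` keep their fine-tuned values;
    every other layer is reset to its pre-trained value.
    """
    rolled = {}
    for name, value in current.items():
        if name in low_level_names:
            # Low-level layer: keep what fine-tuning has learned so far.
            rolled[name] = copy.deepcopy(value)
        else:
            # High-level layer: restore the initial pre-trained weights.
            rolled[name] = copy.deepcopy(pretrained[name])
    return rolled


# Toy weights standing in for a three-block backbone (illustrative values).
pretrained = {"block1": [0.1, 0.2], "block2": [0.3, 0.4], "block3": [0.5, 0.6]}
fine_tuned = {"block1": [0.15, 0.25], "block2": [0.9, 0.8], "block3": [1.5, 1.6]}

# Assume only block1 is treated as low-level in this round.
rolled = roll_back_high_level(fine_tuned, pretrained, low_level_names={"block1"})
```

After the call, `rolled["block1"]` holds the fine-tuned values while `block2` and `block3` hold the pre-trained ones. In a real training loop, fine-tuning would presumably continue from `rolled`, so that the low-level layers are trained further against freshly restored high-level layers.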