Real-time Person Re-identification at the Edge: A Mixed Precision Approach
This work addresses the need for efficient person re-identification for real-time multi-camera tracking applications on resource-limited edge devices, representing an incremental improvement in deployment optimization.
The paper tackles the problem of deploying person re-identification algorithms in real-time, computation-constrained edge scenarios by using a lightweight MobileNet-V2 model with mixed precision training. Relative to a single-precision ResNet-50 baseline, it reports a 3.25x improvement in inference throughput on the edge node (reaching 27.77 fps), a 1.75x reduction in training time on the server, and a 1.45x decrease in edge power consumption, at the cost of a 5.6% accuracy drop averaged over three datasets.
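To put the throughput figure in per-frame terms, the following is a minimal arithmetic sketch. It uses only the two numbers reported above (27.77 fps and the 3.25x factor); the baseline throughput it derives is implied by those numbers rather than stated in this section, and the function and variable names are illustrative only.

```python
def frame_latency_ms(fps: float) -> float:
    """Average time budget per frame, in milliseconds, at a given throughput."""
    return 1000.0 / fps


# Figures taken from the summary above.
reported_fps = 27.77   # MobileNet-V2, mixed precision, on the edge node
speedup = 3.25         # reported throughput improvement over the baseline

# Derived value (an inference from the two figures, not a reported result).
implied_baseline_fps = reported_fps / speedup

print(f"per-frame latency at {reported_fps} fps: {frame_latency_ms(reported_fps):.2f} ms")
print(f"implied baseline throughput: {implied_baseline_fps:.2f} fps")
print(f"implied baseline latency: {frame_latency_ms(implied_baseline_fps):.2f} ms")
```

Read this way, the reported throughput corresponds to roughly 36 ms per frame, against roughly 117 ms per frame for the implied baseline.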
A critical part of multi-person multi-camera tracking is the person re-identification (re-ID) algorithm, which recognizes and retains the identities of all detected unknown people throughout the video stream. Many re-ID algorithms today achieve state-of-the-art results, but little work has explored deploying such algorithms in computation- and power-constrained real-time scenarios. In this paper, we study the effect of using a lightweight model, MobileNet-V2, for re-ID and investigate the impact of single (FP32) precision versus half (FP16) precision for training on the server and inference on the edge nodes. We further compare the results with a baseline model that uses ResNet-50 on state-of-the-art benchmarks including CUHK03, Market-1501, and Duke-MTMC. The MobileNet-V2 mixed precision training method improves inference throughput on the edge node by $3.25\times$, reaching 27.77 fps, and training time on the server by $1.75\times$, and decreases power consumption on the edge node by $1.45\times$, while reducing accuracy by only 5.6\% on average across the three datasets relative to ResNet-50 in single precision. The code and pre-trained networks are publicly available at https://github.com/TeCSAR-UNCC/person-reid.
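The FP32 versus FP16 trade-off can be illustrated numerically. The sketch below is not the training recipe of this paper, which the abstract does not detail; it only shows, with Python's standard `struct` module, two well-known effects of half precision that mixed precision schemes generally have to work around: very small values underflow to zero, and small updates to a weight near 1.0 are rounded away. The helper names, the example values, and the scale factor of 1024 are assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (FP32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 1) Underflow: a tiny gradient-like value survives in FP32 but becomes 0 in FP16.
tiny = 1e-8
tiny_fp16 = to_fp16(tiny)
tiny_fp32 = to_fp32(tiny)

# 2) Lost update: adding a small step to a weight near 1.0 has no effect in FP16,
#    because adjacent FP16 values near 1.0 are about 0.001 apart.
weight, step = 1.0, 1e-4
updated_fp16 = to_fp16(weight + step)
updated_fp32 = to_fp32(weight + step)

# 3) Scaling (illustrative factor): multiplying before the FP16 cast keeps the
#    tiny value representable; dividing afterwards in higher precision recovers it.
scale = 1024.0
recovered = to_fp16(tiny * scale) / scale

print("FP16(1e-8)        =", tiny_fp16)
print("FP32(1e-8)        =", tiny_fp32)
print("FP16(1.0 + 1e-4)  =", updated_fp16)
print("FP32(1.0 + 1e-4)  =", updated_fp32)
print("scaled round trip =", recovered)
```

These effects are the usual reason mixed precision methods keep some quantities in FP32 while running most arithmetic in FP16, which is where the throughput and power savings come from.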