Learning Disentangled Representation with Mutual Information Maximization for Real-Time UAV Tracking
This work addresses efficiency and precision challenges in UAV tracking, which is critical for applications with limited computational resources, though it appears incremental as it builds on existing deep learning-based trackers.
The paper tackles the problem of improving precision and efficiency in UAV tracking by proposing a disentangled representation learning method with mutual information maximization; the resulting tracker significantly outperforms state-of-the-art methods on four UAV benchmarks.
Efficiency is a critical problem in UAV tracking because of limits on computational resources, battery capacity, and the maximum load of the unmanned aerial vehicle. Although discriminative correlation filter (DCF)-based trackers prevail in this field for their favorable efficiency, some recently proposed lightweight deep learning (DL)-based trackers that use model compression have demonstrated remarkable CPU efficiency as well as precision. Unfortunately, the model compression methods used in these works, though simple, still cannot achieve satisfactory tracking precision at higher compression rates. This paper aims to exploit disentangled representation learning with mutual information maximization (DR-MIM) to further improve the precision and efficiency of DL-based trackers for UAV tracking. The proposed disentangled representation separates the feature into an identity-related feature and an identity-unrelated feature. Only the latter is used, which enhances the effectiveness of the feature representation for subsequent classification and regression tasks. Extensive experiments on four UAV benchmarks (UAV123@10fps, DTB70, UAVDT, and VisDrone2018) show that our DR-MIM tracker significantly outperforms state-of-the-art UAV tracking methods.