The Solution for Single Object Tracking Task of Perception Test Challenge 2024
This work addresses the problem of tracking objects in videos for computer vision applications, but it is incremental as it applies an existing method to a new domain.
The authors tackled the Single Object Tracking task in the Perception Test Challenge 2024 by adapting the LoRAT method for visual tracking, achieving a score of 0.813 and first place in the competition.
This report presents our method for Single Object Tracking (SOT), the task of tracking a specified object throughout a video sequence. We employ LoRAT, whose core idea is to adapt LoRA, a technique that fine-tunes a small subset of model parameters without adding inference latency, to the domain of visual tracking. We train our model on the large-scale LaSOT and GOT-10k datasets. Additionally, we apply Alpha-Refine as a post-processing step on the predicted bounding boxes. Although Alpha-Refine does not yield the improvement we anticipated, our overall approach achieves a score of 0.813, securing first place in the competition.
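To illustrate why LoRA adds no inference latency, the following is a minimal sketch of the general LoRA idea, not the LoRAT implementation or our training code. It assumes a single linear layer with a frozen weight `W` and two small trainable matrices `A` and `B`; the dimensions, the scaling factor `alpha`, and all helper names are hypothetical choices made for illustration. After training, the low-rank update can be folded into `W`, so the deployed layer is an ordinary matrix multiply.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def rand_matrix(rows, cols, rng):
    """Random matrix with entries in [-1, 1] (illustrative values only)."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_out, d_in, rank, alpha = 32, 32, 2, 4.0  # hypothetical sizes

W = rand_matrix(d_out, d_in, rng)  # frozen pretrained weight
A = rand_matrix(rank, d_in, rng)   # trainable down-projection
# In LoRA, B starts at zero; random values are used here only so the
# equivalence check below is non-trivial.
B = rand_matrix(d_out, rank, rng)  # trainable up-projection
x = rand_matrix(d_in, 1, rng)      # one input column vector

# Training-time forward pass: frozen path plus scaled low-rank path.
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), alpha / rank))

# Deployment: fold the low-rank update into the weight once.
W_merged = add(W, scale(matmul(B, A), alpha / rank))
y_merged = matmul(W_merged, x)

# Both paths agree up to floating-point rounding.
max_diff = max(abs(p[0] - q[0]) for p, q in zip(y_adapter, y_merged))

trainable = rank * (d_in + d_out)  # parameters in A and B
total = d_out * d_in               # parameters in the frozen W
```

In this toy setting only `trainable` parameters (those in `A` and `B`) would receive gradients, while the merged layer keeps the same shape and cost as the original one.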