Pixel-wise object tracking
This work addresses pixel-level object tracking for applications such as video analysis, but it appears incremental because it builds on existing tracking methods with hybrid components.
The authors tackle pixel-wise visual object tracking of anonymous objects in noisy backgrounds by proposing a framework that combines a global attention model and a local segmentation model, each built on an LSTM structure, and they report real-time performance on the VOT dataset.
In this paper, we propose a novel pixel-wise visual object tracking framework that can track any anonymous object in a noisy background. The framework consists of two submodels: a global attention model and a local segmentation model. The global model generates a region of interest (ROI) in which the object may lie in the new frame, based on the past object segmentation maps, while the local model segments the new image within the ROI. Each model uses an LSTM structure to model temporal dynamics: motion in the global model and appearance in the local model. To circumvent the dependency between the training data of the two models, we use an iterative update strategy. Once the models are trained, there is no need to refine them to track specific objects, which makes our method efficient compared to online learning approaches. We demonstrate our real-time pixel-wise object tracking framework on a challenging VOT dataset.
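The interplay of the two submodels at inference time can be illustrated with a minimal sketch. It shows only the control flow described in the abstract: an ROI is proposed from past segmentation maps, the new frame is segmented inside that ROI, and the result is fed back for the next frame. The names `propose_roi`, `segment_in_roi`, and `track` are hypothetical, and both functions are deliberately simple, stateless stand-ins (a padded bounding box and an intensity threshold); they are not the LSTM-based models of the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]            # binary segmentation map, 1 = object pixel
Frame = List[List[float]]         # grayscale image, values in [0, 1]
ROI = Tuple[int, int, int, int]   # (top, left, bottom, right), bottom/right exclusive


def propose_roi(prev_masks: List[Mask], margin: int = 1) -> ROI:
    """Stand-in for the global attention model (assumption, not the paper's LSTM).

    Pads the bounding box of the most recent mask by `margin` pixels.
    """
    mask = prev_masks[-1]
    h, w = len(mask), len(mask[0])
    rows = [r for r in range(h) if any(mask[r])]
    cols = [c for c in range(w) if any(mask[r][c] for r in range(h))]
    if not rows:  # no object pixels in the last mask: search the whole frame
        return (0, 0, h, w)
    return (max(rows[0] - margin, 0), max(cols[0] - margin, 0),
            min(rows[-1] + 1 + margin, h), min(cols[-1] + 1 + margin, w))


def segment_in_roi(frame: Frame, roi: ROI, threshold: float = 0.5) -> Mask:
    """Stand-in for the local segmentation model (assumption, not the paper's LSTM).

    Labels pixels brighter than `threshold` as object, but only inside the ROI.
    """
    top, left, bottom, right = roi
    h, w = len(frame), len(frame[0])
    mask = [[0] * w for _ in range(h)]
    for r in range(top, bottom):
        for c in range(left, right):
            mask[r][c] = 1 if frame[r][c] > threshold else 0
    return mask


def track(frames: List[Frame], init_mask: Mask) -> List[Mask]:
    """Alternate ROI proposal and in-ROI segmentation, feeding masks back."""
    masks = [init_mask]
    for frame in frames:
        roi = propose_roi(masks)                  # global step: where to look
        masks.append(segment_in_roi(frame, roi))  # local step: what is object
    return masks[1:]
```

Restricting segmentation to the ROI is what lets this toy tracker ignore bright distractors elsewhere in the frame; in the framework described above, the learned global model would play that role using the temporal dynamics of the past segmentation maps.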