MeNToS: Tracklets Association with a Space-Time Memory Network
This work addresses the need for improved data association in MOTS, particularly for applications such as robotics, but the contribution is incremental: it adapts an existing network to a new task.
The authors tackled the data association problem in multi-object tracking and segmentation by proposing MeNToS, a method that uses a space-time memory network to associate tracklets with temporal gaps without fine-tuning, achieving 4th place in the RobMOTS challenge.
We propose a method for multi-object tracking and segmentation (MOTS) that requires neither fine-tuning nor per-benchmark hyperparameter selection. The proposed method focuses on the data association problem. The recently introduced HOTA metric, which aligns better with human visual assessment by evenly balancing detection and association quality, has shown that data association still needs improvement. After creating tracklets using instance segmentation and optical flow, the proposed method relies on a space-time memory network (STM), originally developed for one-shot video object segmentation, to improve the association of tracklets separated by temporal gaps. To the best of our knowledge, our method, named MeNToS, is the first to use the STM network to track object masks for MOTS. We took 4th place in the RobMOTS challenge. The project page is https://mehdimiah.com/mentos.html.
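To make the idea of associating tracklets across temporal gaps more concrete, here is a minimal Python sketch. It is not the authors' implementation. The `propagate` callable is a hypothetical stand-in for a mask-propagation network such as an STM. The mask IoU score, the greedy matching, the `threshold` value, and the tracklet dictionary layout are all assumptions made for illustration only.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return float(inter) / float(union) if union else 0.0

def associate(tracklets, propagate, threshold=0.5):
    """Greedily link tracklets separated by a temporal gap.

    tracklets: list of dicts with 'start' and 'end' (frame indices) and
               'first_mask' and 'last_mask' (boolean arrays). Assumed layout.
    propagate: hypothetical callable (mask, src_frame, dst_frame) -> mask,
               standing in for a mask-propagation network such as an STM.
    Returns a list of (earlier_index, later_index) pairs.
    """
    candidates = []
    for i, a in enumerate(tracklets):
        for j, b in enumerate(tracklets):
            if a["end"] < b["start"]:  # b begins strictly after a ends
                predicted = propagate(a["last_mask"], a["end"], b["start"])
                score = mask_iou(predicted, b["first_mask"])
                if score >= threshold:
                    candidates.append((score, i, j))
    # Best-scoring pairs first; each tracklet gets at most one successor
    # and at most one predecessor.
    candidates.sort(reverse=True)
    has_successor, has_predecessor, links = set(), set(), []
    for score, i, j in candidates:
        if i not in has_successor and j not in has_predecessor:
            links.append((i, j))
            has_successor.add(i)
            has_predecessor.add(j)
    return links

def square(r0, c0, size=4, shape=(10, 10)):
    """Toy boolean mask containing a filled square."""
    m = np.zeros(shape, dtype=bool)
    m[r0:r0 + size, c0:c0 + size] = True
    return m

# Trivial stand-in for propagation: the mask is returned unchanged.
identity = lambda mask, src, dst: mask

tracklets = [
    {"start": 0, "end": 3, "first_mask": square(0, 0), "last_mask": square(0, 0)},
    {"start": 6, "end": 9, "first_mask": square(0, 1), "last_mask": square(0, 1)},
    {"start": 6, "end": 9, "first_mask": square(6, 6), "last_mask": square(6, 6)},
]
links = associate(tracklets, identity)
print(links)
```

In this toy setup the first tracklet is linked to the second, whose first mask overlaps the propagated mask with an IoU of 0.6, while the third tracklet, which lies elsewhere in the frame, stays unlinked. A real pipeline would replace `identity` with an actual propagation model and might use a different similarity measure or matching strategy.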