CV | Apr 24, 2023

Track Anything: Segment Anything Meets Videos

Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng

arXiv:2304.11968v239.2338 citations | h-index: 25 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses video object tracking and segmentation for researchers and practitioners. It offers an interactive tool built on the Segment Anything Model (SAM); the contribution is incremental, extending SAM's image segmentation to videos.

The authors tackle consistent segmentation in videos by proposing the Track Anything Model (TAM), which performs interactive tracking and segmentation from minimal human input (a few clicks) and, without additional training, gives satisfactory results in one-pass inference.

Recently, the Segment Anything Model (SAM) has rapidly gained a lot of attention due to its impressive segmentation performance on images. Despite its strong image segmentation ability and high interactivity with different prompts, we found that it performs poorly on consistent segmentation in videos. Therefore, in this report, we propose the Track Anything Model (TAM), which achieves high-performance interactive tracking and segmentation in videos. In detail, given a video sequence, with very little human participation (i.e., several clicks), people can track anything they are interested in and get satisfactory results in one-pass inference. Without additional training, this interactive design performs impressively on video object tracking and segmentation. All resources are available at https://github.com/gaomingqi/Track-Anything. We hope this work can facilitate related research.
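The interactive flow described in the abstract (a few clicks on one frame, then a single forward pass over the rest of the video) can be illustrated with a minimal sketch. This is not the authors' code: `track_from_clicks`, `segment_fn`, and `propagate_fn` are hypothetical names introduced here, and the toy stand-ins below only match pixel intensities. In a real system the first role would presumably be filled by a click-promptable segmenter such as SAM and the second by a video mask propagator.

```python
from typing import Callable, List, Sequence, Set, Tuple

Frame = List[List[int]]        # toy grayscale frame, indexed [row][col]
Mask = Set[Tuple[int, int]]    # set of (row, col) pixels belonging to the object
Click = Tuple[int, int, bool]  # (row, col, is_positive)


def track_from_clicks(
    frames: Sequence[Frame],
    clicks: Sequence[Click],
    segment_fn: Callable[[Frame, Sequence[Click]], Mask],
    propagate_fn: Callable[[Frame, Mask, Frame], Mask],
) -> List[Mask]:
    """Hypothetical driver: segment the first frame from clicks,
    then carry the mask forward once through the remaining frames."""
    if not frames:
        return []
    masks = [segment_fn(frames[0], clicks)]
    # One pass: each frame is visited exactly once, no training step.
    for prev_frame, frame in zip(frames, frames[1:]):
        masks.append(propagate_fn(prev_frame, masks[-1], frame))
    return masks


def toy_segment(frame: Frame, clicks: Sequence[Click]) -> Mask:
    """Toy stand-in: select every pixel whose value matches a positive click."""
    wanted = {frame[r][c] for r, c, positive in clicks if positive}
    return {(r, c) for r, row in enumerate(frame)
            for c, value in enumerate(row) if value in wanted}


def toy_propagate(prev_frame: Frame, prev_mask: Mask, frame: Frame) -> Mask:
    """Toy stand-in: find pixels in the new frame with the object's old values."""
    wanted = {prev_frame[r][c] for r, c in prev_mask}
    return {(r, c) for r, row in enumerate(frame)
            for c, value in enumerate(row) if value in wanted}


# A bright pixel (value 9) moving across three tiny frames.
frames = [
    [[0, 9, 0], [0, 0, 0]],
    [[0, 0, 9], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 9]],
]
masks = track_from_clicks(frames, [(0, 1, True)], toy_segment, toy_propagate)
print(masks)
```

Passing the two stages in as callables keeps the driver independent of any particular model, which mirrors the abstract's point that the pipeline needs no additional training of its own.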
