CrowdMOT: Crowdsourcing Strategies for Tracking Multiple Objects in Videos
This work addresses scalable video object tracking for applications such as surveillance and biology. Its contribution is incremental, since it builds on existing crowdsourcing frameworks.
The paper tackles the low quality of object tracking annotations collected from non-expert crowdworkers, especially when objects split. It introduces CrowdMOT and investigates micro-task design decisions, yielding strategies for efficiently collecting higher quality annotations than state-of-the-art crowdsourcing systems.
Crowdsourcing is a valuable approach for tracking objects in videos in a more scalable manner than is possible with domain experts. However, existing frameworks do not produce high quality results with non-expert crowdworkers, especially for scenarios where objects split. To address this shortcoming, we introduce a crowdsourcing platform called CrowdMOT, and investigate two micro-task design decisions: (1) whether to decompose the task so that each worker annotates all objects in a sub-segment of the video or a single object across the entire video, and (2) whether to show annotations from previous workers to the next individuals working on the task. We conduct experiments on a diverse set of videos that show both familiar objects (i.e., people) and unfamiliar objects (i.e., cells). Our results highlight strategies for efficiently collecting higher quality annotations than those obtained with the strategies employed by today's state-of-the-art crowdsourcing system.
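The first design decision above can be illustrated with a minimal sketch of the two ways a video might be split into micro-tasks. This is not CrowdMOT's implementation. The `MicroTask` structure, the function names, the fixed `segment_len`, and the assumption that object identifiers are known up front are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class MicroTask:
    """One unit of work for a single crowdworker (hypothetical structure)."""
    object_ids: Tuple[str, ...]  # objects this worker is asked to annotate
    start_frame: int             # first frame, inclusive
    end_frame: int               # last frame, exclusive


def segment_tasks(num_frames: int, object_ids: Sequence[str],
                  segment_len: int) -> List[MicroTask]:
    """Option A: each worker annotates ALL objects in one sub-segment."""
    return [
        MicroTask(tuple(object_ids), start, min(start + segment_len, num_frames))
        for start in range(0, num_frames, segment_len)
    ]


def single_object_tasks(num_frames: int,
                        object_ids: Sequence[str]) -> List[MicroTask]:
    """Option B: each worker annotates ONE object across the entire video."""
    return [MicroTask((oid,), 0, num_frames) for oid in object_ids]


if __name__ == "__main__":
    objects = ["cell_1", "cell_2", "cell_3"]  # assumed known in advance
    for task in segment_tasks(100, objects, segment_len=40):
        print("segment task:", task)
    for task in single_object_tasks(100, objects):
        print("single-object task:", task)
```

Under option A the number of tasks grows with video length, while under option B it grows with the number of objects. The second design decision, showing earlier workers' annotations to later workers, is not modeled in this sketch.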