Semi-Automatic Annotation For Visual Object Tracking
This paper addresses the labor-intensive annotation process for visual object tracking datasets, though the contribution appears incremental, as it builds on existing tracking-by-detection and Multiple Hypothesis Tracking (MHT) techniques.
The paper tackles the problem of reducing manual annotation workload for visual object tracking by proposing a semi-automatic method that uses tracking-by-detection with Multiple Hypothesis Tracking, reporting a reduction of up to 96% in annotation workload on the AUTH Multidrone Dataset.
We propose a semi-automatic bounding box annotation method for visual object tracking that exploits temporal information through a tracking-by-detection approach. For detection, we use an off-the-shelf object detector that is trained iteratively on the annotations generated by the proposed method, and we perform object detection on each frame independently. We employ Multiple Hypothesis Tracking (MHT) to exploit temporal information and to reduce the number of false positives, which makes it possible to use lower objectness thresholds for detection and thereby increase recall. The tracklets formed by MHT are evaluated by human operators, and the accepted tracklets are used to enlarge the training set. This novel incremental learning approach allows annotation to be performed iteratively. Experiments on the AUTH Multidrone Dataset show that the proposed approach can reduce the annotation workload by up to 96%.
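The interplay between a low objectness threshold and temporal filtering can be illustrated with a minimal sketch. It is not the authors' implementation and it does not implement MHT: as a simplified stand-in, it links per-frame detections greedily by intersection-over-union (IoU) and discards short tracklets, so that isolated false positives are removed before a human operator reviews the result. All names and default values here (`Detection`, `Tracklet`, `link_detections`, `score_thr`, `iou_thr`, `min_len`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass
class Detection:
    frame: int
    box: Box
    score: float  # objectness score from the per-frame detector


@dataclass
class Tracklet:
    detections: List[Detection] = field(default_factory=list)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]],
                    score_thr: float = 0.2,   # assumed low threshold
                    iou_thr: float = 0.5,     # assumed linking threshold
                    min_len: int = 3) -> List[Tracklet]:
    """Greedy IoU linking (a simplified stand-in for MHT).

    Keeps low-score detections to favor recall, then drops tracklets
    shorter than min_len, which removes temporally isolated detections.
    """
    active: List[Tracklet] = []
    finished: List[Tracklet] = []
    for dets in frames:
        candidates = [d for d in dets if d.score >= score_thr]
        used = set()
        next_active: List[Tracklet] = []
        for track in active:
            last_box = track.detections[-1].box
            best_i, best_iou = -1, iou_thr
            for i, det in enumerate(candidates):
                if i in used:
                    continue
                overlap = iou(last_box, det.box)
                if overlap >= best_iou:
                    best_i, best_iou = i, overlap
            if best_i >= 0:
                used.add(best_i)
                track.detections.append(candidates[best_i])
                next_active.append(track)
            else:
                finished.append(track)  # no match in this frame: close it
        # Unmatched detections start new tracklets.
        for i, det in enumerate(candidates):
            if i not in used:
                next_active.append(Tracklet([det]))
        active = next_active
    finished.extend(active)
    return [t for t in finished if len(t.detections) >= min_len]
```

In an iterative annotation loop of the kind the abstract describes, the surviving tracklets would be shown to a human operator, and the accepted ones would be added to the training set before the detector is retrained. A real MHT keeps several competing association hypotheses across frames instead of committing to the best match in each frame, so this greedy version only conveys the filtering idea.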