CV · Jun 20, 2025

YASMOT: Yet another stereo image multi-object tracker

arXiv:2506.17186v1 · h-index: 20
Originality: Synthesis-oriented
AI Analysis

This work provides a tool for improving object detection and enabling downstream tasks such as behavior classification and abundance estimation, but it appears incremental: it builds on existing detectors rather than introducing a major breakthrough.

The authors tackled the problem of tracking objects over time in image sequences by presenting yasmot, a lightweight and flexible tracker that processes outputs from object detectors for monoscopic or stereoscopic cameras, and includes functionality for generating consensus detections from detector ensembles.

There now exist many popular object detectors based on deep learning that can analyze images and extract locations and class labels for occurrences of objects. For image time series (i.e., video or sequences of stills), tracking objects over time and preserving object identity can help to improve object detection performance, and is necessary for many downstream tasks, including classifying and predicting behaviors and estimating total abundances. Here we present yasmot, a lightweight and flexible object tracker that can process the output from popular object detectors and track objects over time from either monoscopic or stereoscopic camera configurations. In addition, it includes functionality to generate consensus detections from ensembles of object detectors.
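To make the idea of tracking on top of detector output concrete, here is a minimal sketch of frame-to-frame linking by greedy intersection-over-union (IoU) matching. It is an illustration only and is not taken from yasmot: the `Box` type, the `link_frames` function, and the `min_iou` threshold are all hypothetical names chosen for this example, and the abstract does not say which association method yasmot actually uses.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """One detection: corner coordinates plus a class label (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float
    label: str


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a.x2, b.x2) - max(a.x1, b.x1)
    ih = min(a.y2, b.y2) - max(a.y1, b.y1)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union


def link_frames(frames, min_iou=0.3):
    """Assign a track id to every detection in a sequence of frames.

    Greedy matching (an assumption of this sketch): within each frame,
    same-label pairs are linked to the previous frame in order of
    decreasing IoU; unmatched detections start new tracks.
    """
    next_id = 0
    prev = []  # (track_id, Box) pairs from the previous frame
    result = []
    for boxes in frames:
        pairs = sorted(
            (
                (iou(pb, b), pi, bi)
                for pi, (_, pb) in enumerate(prev)
                for bi, b in enumerate(boxes)
                if pb.label == b.label
            ),
            reverse=True,
        )
        assigned = {}      # index in current frame -> track id
        used_prev = set()  # indices in previous frame already linked
        for score, pi, bi in pairs:
            if score < min_iou:
                break
            if pi in used_prev or bi in assigned:
                continue
            assigned[bi] = prev[pi][0]
            used_prev.add(pi)
        current = []
        for bi, b in enumerate(boxes):
            if bi not in assigned:
                assigned[bi] = next_id
                next_id += 1
            current.append((assigned[bi], b))
        result.append(current)
        prev = current
    return result
```

A detection that overlaps a box from the previous frame keeps that box's track id, while a detection with no sufficiently overlapping predecessor receives a fresh id. A tracker meant for real use would also need to handle missed detections across several frames and, for stereoscopic input, matching between the left and right cameras; this sketch covers neither.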

