CV, AI, LG, MM, NE
May 20, 2021

Robust Unsupervised Multi-Object Tracking in Noisy Environments

arXiv:2105.10005v4
1 citation
Originality: Incremental advance
AI Analysis

This work addresses the robustness of unsupervised MOT, which matters for applications such as surveillance and autonomous systems operating in unpredictable conditions. It appears incremental, however, because it builds on existing unsupervised methods.

The paper tackles the problem of unsupervised multi-object tracking (MOT) in noisy video environments, showing that existing methods degrade sharply with added noise, and introduces AttU-Net, which improves performance over state-of-the-art baselines on benchmarks like MNIST-MOT and Atari game videos.

Physical processes, camera movement, and unpredictable environmental conditions such as the presence of dust can induce noise and artifacts in video feeds. We observe that popular unsupervised MOT methods depend on noise-free inputs. We show that adding a small amount of artificial random noise causes a sharp degradation in model performance on benchmark metrics. We resolve this problem by introducing a robust unsupervised multi-object tracking (MOT) model: AttU-Net. The proposed single-head attention model helps limit the negative impact of noise by learning visual representations at different segment scales. AttU-Net achieves better unsupervised MOT performance than variational inference-based state-of-the-art baselines. We evaluate our method on the MNIST-MOT and Atari game video benchmarks. We also provide two extended video datasets, "Kuzushiji-MNIST MOT", which consists of moving Japanese characters, and "Fashion-MNIST MOT", to validate the effectiveness of the MOT models.
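The noise perturbation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which noise distribution or magnitude was used, so the additive Gaussian noise, the `sigma` value, the `add_gaussian_noise` helper, and the synthetic clip below are all assumptions made for illustration, not the authors' procedure.

```python
import numpy as np


def add_gaussian_noise(frames, sigma=0.1, seed=0):
    """Return a noisy copy of `frames`, a float array with values in [0, 1].

    Assumption: zero-mean additive Gaussian noise; the paper may use a
    different noise model.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma, size=frames.shape)
    # Clip so the result remains a valid normalized image.
    return np.clip(frames + noise, 0.0, 1.0)


# Hypothetical clip: 10 grayscale frames of 64x64 pixels with one bright square.
clip = np.zeros((10, 64, 64), dtype=np.float32)
clip[:, 20:30, 20:30] = 1.0

noisy = add_gaussian_noise(clip, sigma=0.1)
```

Running a tracker on both `clip` and `noisy` and comparing its metrics is one way to probe the kind of sensitivity to noise that the abstract reports.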

Code Implementations: 1 repo
