CVJun 3, 2025

SportMamba: Adaptive Non-Linear Multi-Object Tracking with State Space Models for Team Sports

arXiv:2506.03335v14 citationsh-index: 42025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Originality Incremental advance
AI Analysis

This addresses tracking difficulties in dynamic team sports for applications like sports analytics, though it appears incremental as it builds on existing methods with specific adaptations.

The paper tackled the problem of multi-object tracking in team sports, which is challenging due to fast motion and occlusions, by introducing SportMamba, an adaptive hybrid technique that achieved state-of-the-art performance on the SportsMOT dataset and demonstrated generalization to an ice hockey dataset.

Multi-object tracking (MOT) in team sports is particularly challenging due to the fast-paced motion and frequent occlusions resulting in motion blur and identity switches, respectively. Predicting player positions in such scenarios is particularly difficult due to the observed highly non-linear motion patterns. Current methods are heavily reliant on object detection and appearance-based tracking, which struggle to perform in complex team sports scenarios, where appearance cues are ambiguous and motion patterns do not necessarily follow a linear pattern. To address these challenges, we introduce SportMamba, an adaptive hybrid MOT technique specifically designed for tracking in dynamic team sports. The technical contribution of SportMamba is twofold. First, we introduce a mamba-attention mechanism that models non-linear motion by implicitly focusing on relevant embedding dependencies. Second, we propose a height-adaptive spatial association metric to reduce ID switches caused by partial occlusions by accounting for scale variations due to depth changes. Additionally, we extend the detection search space with adaptive buffers to improve associations in fast-motion scenarios. Our proposed technique, SportMamba, demonstrates state-of-the-art performance on various metrics in the SportsMOT dataset, which is characterized by complex motion and severe occlusion. Furthermore, we demonstrate its generalization capability through zero-shot transfer to VIP-HTD, an ice hockey dataset.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes