Multiple object tracking with context awareness
This work addresses tracking challenges for applications such as surveillance and animation, but appears incremental as it builds on existing methods by incorporating more context.
The paper tackles multiple people tracking in crowded environments by focusing on data association. It argues that contextual information, namely social context and spatial context from multiple views, is underutilized, but it does not report specific results or numbers.
Multiple people tracking is a key problem for many applications such as surveillance, animation, or car navigation, and a key input for tasks such as activity recognition. In crowded environments, occlusions and false detections are common, and although there have been substantial advances in recent years, tracking remains a challenging task. Tracking is typically divided into two steps: detection, i.e., locating the pedestrians in the image, and data association, i.e., linking detections across frames to form complete trajectories. For the data association task, approaches typically aim at developing new, more complex formulations, which in turn put the focus on the optimization techniques required to solve them. However, these formulations still rely on very basic information such as the distance between detections. In this thesis, I focus on the data association task and argue that there is contextual information that the tracking community has not yet fully exploited, mainly social context and spatial context coming from different views.
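To make the data association step concrete, the following is a minimal sketch of a distance-only baseline of the kind the abstract describes as "very basic information". It is not the method proposed in the thesis. The function name `associate`, the `(x, y)` detection format, and the `max_dist` gating threshold are all assumptions made for illustration, and greedy matching is used here only because it is short; published trackers typically solve the assignment with a global optimization instead.

```python
import math


def associate(prev, curr, max_dist=50.0):
    """Greedily link detections in two consecutive frames by Euclidean distance.

    prev, curr: lists of (x, y) detection centers (hypothetical input format).
    max_dist:   gating threshold; pairs farther apart are never linked
                (illustrative value, not taken from the thesis).
    Returns a sorted list of (prev_index, curr_index) pairs.
    """
    # Collect every candidate pair within the gate, cheapest first.
    candidates = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(prev)
        for j, c in enumerate(curr)
        if math.dist(p, c) <= max_dist
    )

    used_prev, used_curr, links = set(), set(), []
    for _, i, j in candidates:
        # Each detection may take part in at most one link.
        if i not in used_prev and j not in used_curr:
            links.append((i, j))
            used_prev.add(i)
            used_curr.add(j)
    return sorted(links)


if __name__ == "__main__":
    frame_t = [(0, 0), (10, 0)]
    frame_t1 = [(11, 1), (1, 0), (200, 200)]
    print(associate(frame_t, frame_t1))
```

Detections left unlinked, such as the one at `(200, 200)` above, would start a new trajectory or be discarded as false detections. Because the cost uses nothing but distance, the sketch has no way to tell apart two pedestrians who cross paths or walk together, which is the kind of ambiguity that social and multi-view context is meant to resolve.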