An Integrated Approach to Crowd Video Analysis: From Tracking to Multi-level Activity Recognition
This work addresses real-time crowd analysis for surveillance or safety applications. Its contribution is incremental: it combines existing methods in a unified framework.
The authors develop an integrated framework for simultaneous tracking, group detection, and multi-level activity recognition in crowd videos. Although it operates online, it performs competitively with state-of-the-art batch methods.
We present an integrated framework for simultaneous tracking, group detection, and multi-level activity recognition in crowd videos. Instead of solving these problems independently and sequentially, we solve them jointly to exploit the strong correlation among individual motion, groups, and activities. We explore the hierarchical structure hidden in the video, which connects individuals over time to produce tracks, connects individuals to form groups, and connects groups to form a crowd. We show that estimating this hidden structure corresponds to track association and group detection, and we estimate it under a linear programming formulation. The resulting graphical representation is then used to infer the node values, which corresponds to multi-level activity recognition. This problem is solved under a structured SVM framework. Results on a publicly available dataset show performance that is highly competitive with state-of-the-art batch processing methods at all levels of granularity, even though the proposed technique is an online (causal) one.
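As a rough illustration of the track-association step only, the sketch below matches the last known position of each track to the detections in the next frame by minimizing total Euclidean distance. It is not the formulation used in this work: instead of solving a linear program, it enumerates every one-to-one matching by brute force, which is only practical for a handful of targets. Each such matching is a vertex of the feasible region of the standard assignment LP, so for this simplified cost the enumeration reaches the same optimum an LP solver would. The function name `associate`, the distance-based cost, and the toy coordinates are all assumptions made for the example; group detection and activity recognition are not modeled.

```python
from itertools import permutations
from math import dist


def associate(track_ends, detections):
    """Match each track to one detection, minimizing total distance.

    Hypothetical helper for illustration: returns (assignment, total_cost),
    where assignment[i] is the index of the detection matched to track i.
    """
    n = len(track_ends)
    if n != len(detections):
        raise ValueError("this sketch assumes equally many tracks and detections")

    # cost[i][j]: Euclidean distance from track i's last position to detection j
    # (an assumed cost; the source does not specify its association cost here).
    cost = [[dist(t, d) for d in detections] for t in track_ends]

    best_perm, best_cost = None, float("inf")
    # Each permutation is one one-to-one matching of tracks to detections.
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Toy data (made up): three track end points and three new detections.
track_ends = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
detections = [(9.0, 1.0), (1.0, 0.0), (5.0, 6.0)]

assignment, total = associate(track_ends, detections)
print(assignment, round(total, 3))  # → [1, 2, 0] 3.414
```

Because the loop uses only the current track ends and the newest detections, it mirrors the online (causal) setting in spirit: no future frames are consulted when a match is made.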