Semi-Supervised First-Person Activity Recognition in Body-Worn Video
This paper addresses the challenge of activity recognition in unbalanced, privacy-sensitive police body-worn video when labeled data is limited.
The paper tackles the problem of classifying frames of body-worn video footage by the camera-wearer's activity, with a focus on real-world police applications. On publicly available datasets, it achieves results comparable to or better than supervised and deep learning methods while using significantly less training data.
Body-worn cameras are now commonly used for logging daily life, sports, and law enforcement activities, creating a large volume of archived footage. This paper studies the problem of classifying frames of footage according to the activity of the camera-wearer, with an emphasis on application to real-world police body-worn video. Real-world datasets pose a different set of challenges from existing egocentric vision datasets: the amount of footage of different activities is unbalanced, the data contains personally identifiable information, and in practice it is difficult to provide substantial training footage for a supervised approach. We address these challenges by extracting features based exclusively on motion information and then segmenting the video footage using a semi-supervised classification algorithm. On publicly available datasets, our method achieves results comparable to, if not better than, supervised and/or deep learning methods using a fraction of the training data. It also shows promising results on real-world police body-worn video.
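As a rough illustration of the semi-supervised step described above, the following Python sketch spreads a handful of known frame labels over a similarity graph built from per-frame features. It is not the authors' algorithm: the abstract does not name the classifier, so standard graph label spreading is swapped in here. The function name `propagate_labels`, the parameters `sigma`, `alpha`, and `n_iter`, and the synthetic 2-D "motion features" are all assumptions made for illustration.

```python
import numpy as np


def propagate_labels(features, labels, sigma=1.0, alpha=0.9, n_iter=200):
    """Graph label spreading (a stand-in, not the paper's classifier).

    features: (n, d) array, one hypothetical motion descriptor per frame.
    labels:   (n,) int array, class id for labeled frames, -1 for unlabeled.
    Returns an (n,) array of predicted class ids.
    """
    n = features.shape[0]
    classes = np.unique(labels[labels >= 0])

    # Gaussian affinities between all pairs of frames (no self-loops).
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degree = W.sum(axis=1)
    S = W / np.sqrt(np.outer(degree, degree))

    # One-hot seed matrix: rows of unlabeled frames stay all zero.
    Y = np.zeros((n, classes.size))
    for k, c in enumerate(classes):
        Y[labels == c, k] = 1.0

    # Iterate F <- alpha * S F + (1 - alpha) * Y.
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return classes[np.argmax(F, axis=1)]


if __name__ == "__main__":
    # Synthetic, unbalanced data: 40 frames of one activity, 10 of another.
    rng = np.random.default_rng(0)
    feats = np.vstack([
        rng.normal(0.0, 0.3, size=(40, 2)),
        rng.normal(5.0, 0.3, size=(10, 2)),
    ])
    truth = np.array([0] * 40 + [1] * 10)

    # Only one labeled frame per activity; everything else is unlabeled.
    seeds = np.full(50, -1)
    seeds[0], seeds[40] = 0, 1

    pred = propagate_labels(feats, seeds)
    print("accuracy:", np.mean(pred == truth))
```

The demo uses one labeled frame per class to mirror the limited-label setting the abstract describes. Real body-worn footage would need actual motion descriptors in place of the synthetic clusters, and a sparse neighborhood graph would likely replace the dense affinity matrix for long videos.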