Latent Bi-constraint SVM for Video-based Object Recognition
This work addresses the relatively unexplored problem of object recognition from video input. The contribution is incremental, as it builds on existing SVM methods.
The paper tackles video-based object recognition by proposing Latent Bi-constraint SVM (LBSVM), a maximum-margin framework that extends Structured-Output SVM to handle noisy video data and ensure temporal consistency. It demonstrates the benefits of LBSVM over existing methods on new datasets for office objects and museum sculptures.
We address the task of recognizing objects from video input. This important problem is relatively unexplored compared with image-based object recognition. To this end, we make the following contributions. First, we introduce two comprehensive datasets for video-based object recognition. Second, we propose Latent Bi-constraint SVM (LBSVM), a maximum-margin framework for video-based object recognition. LBSVM is based on Structured-Output SVM, but extends it to handle noisy video data and to ensure that the output decision is consistent over time. We apply LBSVM to recognize office objects and museum sculptures, and we demonstrate its benefits over image-based, set-based, and other video-based object recognition methods.