Region-Based Multiscale Spatiotemporal Saliency for Video
This paper addresses the problem of efficiently detecting salient objects in videos for applications such as video compression and surveillance, though its contribution appears incremental.
The paper tackles video salient object detection by combining static and dynamic features across spatial and temporal dimensions, using multiscale segmentation and adaptive temporal windows. The method outperforms existing state-of-the-art approaches on several benchmark datasets.
Detecting salient objects in a video requires exploiting both the spatial and the temporal information it contains. We propose a novel region-based multiscale spatiotemporal saliency detection method for videos, in which static and dynamic features computed at the low and middle levels are combined. Our method uses these combined features spatially within each frame and, at the same time, temporally across frames, exploiting consistency between consecutive frames. Saliency cues are analyzed through a multiscale segmentation model and fused across scale levels, which allows regions to be explored efficiently. An adaptive temporal window based on motion information is also developed to combine the saliency values of consecutive frames and thereby maintain temporal consistency. Performance evaluation on several popular benchmark datasets shows that our method outperforms existing state-of-the-art methods.
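The two fusion steps named in the abstract, fusing saliency across scale levels and combining consecutive frames through a motion-driven temporal window, can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: plain averaging as the cross-scale fusion rule, a linear mapping from motion magnitude to window radius (strong motion gives a short window), and the hypothetical names fuse_scales, window_radius, temporal_smooth, max_radius, and motion_cap. It should not be read as the authors' method.

```python
from typing import List, Sequence


def fuse_scales(scale_maps: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-pixel saliency maps from several segmentation scales.

    Assumption: fusion is an unweighted mean over scales; the paper's
    actual fusion rule is not stated in the abstract.
    """
    n_scales = len(scale_maps)
    n_pixels = len(scale_maps[0])
    return [sum(m[i] for m in scale_maps) / n_scales for i in range(n_pixels)]


def window_radius(motion: float, max_radius: int = 3, motion_cap: float = 10.0) -> int:
    """Map a frame's motion magnitude to a temporal window radius.

    Hypothetical rule: the radius shrinks linearly from max_radius (no motion)
    to 0 (motion at or above motion_cap), so fast-moving content is smoothed
    over fewer neighbouring frames.
    """
    ratio = min(max(motion, 0.0), motion_cap) / motion_cap
    return int(max_radius * (1.0 - ratio))


def temporal_smooth(
    frames: Sequence[Sequence[float]],
    motions: Sequence[float],
    max_radius: int = 3,
) -> List[List[float]]:
    """Average each frame's saliency with its neighbours inside the adaptive window."""
    smoothed = []
    last = len(frames) - 1
    for t, frame in enumerate(frames):
        r = window_radius(motions[t], max_radius)
        lo, hi = max(0, t - r), min(last, t + r)  # clip the window to the video
        window = frames[lo:hi + 1]
        smoothed.append(
            [sum(f[i] for f in window) / len(window) for i in range(len(frame))]
        )
    return smoothed


if __name__ == "__main__":
    # Two scales, two pixels: fuse, then smooth three single-pixel frames.
    print(fuse_scales([[0.0, 1.0], [1.0, 1.0]]))
    print(temporal_smooth([[0.0], [1.0], [0.0]], [10.0, 0.0, 10.0], max_radius=1))
```

In this sketch each frame is a flat list of saliency values and each motion entry is a single non-negative magnitude per frame, for example a mean optical-flow length; both representations are simplifications chosen to keep the example self-contained.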