Point-wise mutual information-based video segmentation with high temporal consistency
This work addresses temporal consistency in video segmentation for computer vision applications. It is an incremental advance that focuses on a previously under-addressed low-level component: temporally consistent boundary detection.
The paper addresses temporally consistent boundary detection and hierarchical segmentation in videos using the point-wise mutual information (PMI) of spatio-temporal voxels. It reports outperforming the learning-based state of the art on standard region metrics without relying on optical flow or learned motion models.
In this paper, we tackle the problem of temporally consistent boundary detection and hierarchical segmentation in videos. While finding the best high-level reasoning of region assignments in videos is the focus of much recent research, temporal consistency in boundary detection has so far only rarely been tackled. We argue that temporally consistent boundaries are a key component to temporally consistent region assignment. The proposed method is based on the point-wise mutual information (PMI) of spatio-temporal voxels. Temporal consistency is established by an evaluation of PMI-based point affinities in the spectral domain over space and time. Thus, the proposed method is independent of any optical flow computation or previously learned motion models. The proposed low-level video segmentation method outperforms the learning-based state of the art in terms of standard region metrics.
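To make the idea of PMI-based affinities evaluated in the spectral domain more concrete, the following is a minimal sketch and not the implementation used in the paper. It makes several simplifying assumptions that do not come from the abstract: voxels are described only by quantized grayscale intensity, the joint statistics are gathered from 6-connected spatio-temporal neighbours, the affinity is taken as exp(PMI), and a dense eigendecomposition of the normalized graph Laplacian stands in for the spectral analysis. The function names `pmi_affinity` and `spectral_embedding` are hypothetical.

```python
import numpy as np


def pmi_affinity(video, n_bins=4, eps=1e-12):
    """Dense voxel affinity matrix from PMI of quantized intensities.

    video: float array of shape (T, H, W) with values in [0, 1].
    Assumption: intensity is the only feature; neighbours are 6-connected.
    """
    q = np.minimum((video * n_bins).astype(int), n_bins - 1).ravel()
    idx = np.arange(video.size).reshape(video.shape)

    # Collect neighbouring voxel pairs along time, height, and width.
    firsts, seconds = [], []
    for axis in range(3):
        lo = [slice(None)] * 3
        hi = [slice(None)] * 3
        lo[axis] = slice(None, -1)
        hi[axis] = slice(1, None)
        firsts.append(idx[tuple(lo)].ravel())
        seconds.append(idx[tuple(hi)].ravel())
    a = np.concatenate(firsts)
    b = np.concatenate(seconds)

    # Symmetric joint distribution of intensity bins over neighbouring pairs.
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (q[a], q[b]), 1.0)
    joint = joint + joint.T
    joint /= joint.sum()
    marginal = joint.sum(axis=1)

    # PMI(x, y) = log P(x, y) / (P(x) P(y)); eps guards empty bins.
    pmi = np.log((joint + eps) / (np.outer(marginal, marginal) + eps))

    # Affinity between neighbouring voxels: exp(PMI) keeps weights positive.
    n = video.size
    A = np.zeros((n, n))
    w = np.exp(pmi[q[a], q[b]])
    A[a, b] = w
    A[b, a] = w
    return A


def spectral_embedding(A, k=2):
    """Smallest k eigenpairs of the symmetric normalized graph Laplacian."""
    d = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vals[:k], vecs[:, :k]


# Toy volume: 3 frames of 4x4 pixels, dark left half and bright right half.
video = np.full((3, 4, 4), 0.1)
video[:, :, 2:] = 0.9

A = pmi_affinity(video)
vals, vecs = spectral_embedding(A, k=2)

# The sign of the second eigenvector gives a two-way split of the voxels
# that is shared by all frames, since the graph spans space and time.
fiedler = vecs[:, 1].reshape(video.shape)
labels = fiedler > 0
```

In this toy volume the rare dark-to-bright transitions receive a low PMI and hence a weak affinity, so the split falls on the intensity edge in every frame. Because temporal neighbours are part of the same graph, no optical flow is needed to link frames in this sketch, which mirrors the motivation stated in the abstract.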