Recurrent Segmentation for Variable Computational Budgets
The paper addresses the challenge of designing segmentation systems that adapt to different computational budgets, though the contribution is incremental: it adapts existing recurrent approaches to a new context.
The paper tackles the problem of building image segmentation systems that work across variable computational budgets. It develops a recurrent neural network that improves its predictions with each iteration, reaches performance similar to state-of-the-art methods on the PASCAL VOC 2012 and Cityscapes datasets, and enables efficient video segmentation.
State-of-the-art systems for semantic image segmentation use feed-forward pipelines with fixed computational costs. Building an image segmentation system that works across a range of computational budgets is challenging and time-intensive, because new architectures must be designed and trained for every computational setting. To address this problem, we develop a recurrent neural network that successively improves prediction quality with each iteration. Importantly, the RNN may be deployed across a range of computational budgets by merely running the model for a variable number of iterations. We find that this architecture is uniquely suited for efficiently segmenting videos. By exploiting the segmentation of past frames, the RNN can perform video segmentation at similar quality but reduced computational cost compared to state-of-the-art image segmentation methods. When applied to static images in the PASCAL VOC 2012 and Cityscapes segmentation datasets, the RNN traces out a speed-accuracy curve that saturates near the performance of state-of-the-art segmentation methods.
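The control flow described in the abstract can be illustrated with a toy sketch. It is not the paper's architecture: the names `refine_step`, `segment`, and `to_mask` are hypothetical, and the update rule (blending image evidence with a neighbor average) is a hand-written stand-in for a learned recurrent cell. The sketch only shows two ideas from the text: the same step is applied a variable number of times to trade compute for quality, and a video frame can start from the previous frame's state instead of from scratch.

```python
from typing import List, Optional

Grid = List[List[float]]  # one score per pixel; 1.0 means "foreground"


def refine_step(image: Grid, state: Grid, rate: float = 0.5) -> Grid:
    """One recurrent step; the same function is reused at every iteration.

    Assumption: a toy update that pulls each pixel's score toward a blend of
    the image evidence and the mean score of its 4-connected neighbors.
    """
    h, w = len(image), len(image[0])
    new_state: Grid = []
    for y in range(h):
        row = []
        for x in range(w):
            nbrs = [state[j][i]
                    for j, i in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= j < h and 0 <= i < w]
            nbr_mean = sum(nbrs) / len(nbrs) if nbrs else state[y][x]
            target = 0.5 * image[y][x] + 0.5 * nbr_mean
            row.append(state[y][x] + rate * (target - state[y][x]))
        new_state.append(row)
    return new_state


def segment(image: Grid, num_iters: int,
            init_state: Optional[Grid] = None) -> Grid:
    """Run `num_iters` refinement steps; the budget is set by `num_iters`."""
    if init_state is None:
        # Cold start: every pixel begins undecided.
        state = [[0.5] * len(image[0]) for _ in image]
    else:
        # Warm start, e.g. from the previous video frame's final state.
        state = [row[:] for row in init_state]
    for _ in range(num_iters):
        state = refine_step(image, state)
    return state


def to_mask(state: Grid) -> List[List[int]]:
    """Threshold per-pixel scores into a binary mask."""
    return [[1 if s > 0.5 else 0 for s in row] for row in state]


# Hypothetical 2x4 "frames"; frame_1 differs from frame_0 in one pixel.
frame_0 = [[1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
frame_1 = [[1.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]

cheap = segment(frame_0, num_iters=1)    # small budget
better = segment(frame_0, num_iters=8)   # larger budget, same model
# Video: reuse the previous frame's state and spend only a few iterations.
next_state = segment(frame_1, num_iters=2, init_state=better)
```

In this sketch, a warm start followed by k steps on an unchanged frame is the same as running k more steps from the earlier state, which is the sense in which past computation is reused rather than repeated.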