Superpixel-based Semantic Segmentation Trained by Statistical Process Control
This work addresses computational inefficiency in semantic segmentation for computer vision applications, though it appears incremental as it builds on existing deep learning and superpixel techniques.
The paper tackles the redundancy in semantic segmentation by training and testing with only 0.37% of total pixels using superpixel-based sampling, which reduces computational complexity and achieves performance equal to or better than conventional methods on Pascal Context and SUN-RGBD datasets.
Semantic segmentation, like other fields of computer vision, has advanced remarkably through the use of deep convolutional neural networks. However, because neighboring pixels are heavily dependent on each other, both training and testing of these methods involve many redundant operations. To resolve this problem, the proposed network is trained and tested with only 0.37% of the total pixels, selected by superpixel-based sampling, which also greatly reduces the complexity of the upsampling calculation. The hypercolumn feature maps are constructed by a pyramid module in combination with the convolution layers of the base network. Because the proposed method uses a very small number of sampled pixels, end-to-end learning of the entire network is difficult with a common learning rate for all layers. To resolve this problem, the learning rate after sampling is controlled by statistical process control (SPC) of the gradients in each layer. The proposed method performs as well as or better than conventional methods that use many more samples on the Pascal Context and SUN-RGBD datasets.
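The abstract does not specify which control chart, control limits, or update rule are used for the SPC-based learning rate. As an illustration only, the following minimal sketch assumes a Shewhart-style chart with 3-sigma limits on a layer's gradient norm, and a simple rule that halves the learning rate when the norm exceeds the upper limit and doubles it when the norm falls below the lower limit. The names `GradientControlChart` and `adjust_learning_rate`, the window size, and the scaling factor are all hypothetical and not taken from the paper.

```python
import math
from collections import deque


class GradientControlChart:
    """Shewhart-style control chart over one layer's recent gradient norms.

    Assumption: the paper's actual SPC rule is not given in the abstract;
    mean +/- k*sigma limits are used here purely for illustration.
    """

    def __init__(self, window=20, k=3.0):
        self.history = deque(maxlen=window)  # recent in-control gradient norms
        self.k = k                           # limit width in standard deviations

    def limits(self):
        """Return (lower, upper) control limits, or None until two samples exist."""
        n = len(self.history)
        if n < 2:
            return None
        mean = sum(self.history) / n
        var = sum((g - mean) ** 2 for g in self.history) / (n - 1)
        sigma = math.sqrt(var)
        return max(0.0, mean - self.k * sigma), mean + self.k * sigma

    def update(self, grad_norm):
        """Classify a new gradient norm as 'high', 'low', or 'ok'."""
        state = "ok"
        lim = self.limits()
        if lim is not None:
            lower, upper = lim
            if grad_norm > upper:
                state = "high"
            elif grad_norm < lower:
                state = "low"
        if state == "ok":
            # Only in-control points contribute to the limit estimates.
            self.history.append(grad_norm)
        return state


def adjust_learning_rate(lr, state, factor=0.5, lr_min=1e-6, lr_max=1.0):
    """Hypothetical rule: shrink lr on 'high', grow it on 'low', else keep it."""
    if state == "high":
        lr *= factor
    elif state == "low":
        lr /= factor
    return min(max(lr, lr_min), lr_max)


# Usage: one chart per layer, fed with that layer's gradient norm each step.
chart = GradientControlChart()
lr = 0.01
for norm in [1.0, 1.1, 0.9, 1.0, 1.05, 10.0]:
    lr = adjust_learning_rate(lr, chart.update(norm))
```

In a real training loop, one chart would be kept per layer and the resulting rate applied to that layer's parameter group. Out-of-control points are excluded from the history here so that a single gradient spike does not widen the limits, which is a design choice of this sketch rather than something stated in the source.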