Constrained Convolutional Neural Networks for Weakly Supervised Segmentation
This work aims to reduce the annotation effort required for semantic segmentation in computer vision, presenting an incremental improvement over existing weakly supervised methods.
The paper tackles weakly supervised semantic image segmentation by learning a dense pixel-wise labeling from image-level tags, using a novel loss function in a Constrained Convolutional Neural Network (CCNN). It reports state-of-the-art results, with further gains when slightly more supervision is added.
We present an approach to learn a dense pixel-wise labeling from image-level tags. Each image-level tag imposes constraints on the output labeling of a Convolutional Neural Network (CNN) classifier. We propose Constrained CNN (CCNN), a method which uses a novel loss function to optimize for any set of linear constraints on the output space (i.e. predicted label distribution) of a CNN. Our loss formulation is easy to optimize and can be incorporated directly into standard stochastic gradient descent optimization. The key idea is to phrase the training objective as a biconvex optimization for linear models, which we then relax to nonlinear deep networks. Extensive experiments demonstrate the generality of our new learning framework. The constrained loss yields state-of-the-art results on weakly supervised semantic image segmentation. We further demonstrate that adding slightly more supervision can greatly improve the performance of the learning algorithm.
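To make the idea of linear constraints on the predicted label distribution concrete, here is a minimal sketch in Python with NumPy. It is not the paper's loss: the abstract describes a biconvex formulation, while this example swaps in a plain hinge penalty on constraint violations. The function name `tag_constraint_penalty`, the `min_frac` parameter, and the specific constraints (a present tag must cover at least a fraction of the pixels, an absent tag should cover none) are assumptions made for illustration only.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def tag_constraint_penalty(logits, present, min_frac=0.05):
    """Hinge penalty for violating tag-derived linear constraints.

    logits:   array of shape (n_pixels, n_classes), per-pixel class scores.
    present:  set of class indices given as image-level tags (hypothetical input format).
    min_frac: assumed lower bound on the pixel fraction of each present class.
    """
    q = softmax(logits)                      # predicted label distribution per pixel
    n_pixels, n_classes = q.shape
    counts = q.sum(axis=0)                   # expected pixel count per class, linear in q

    present_mask = np.zeros(n_classes, dtype=bool)
    present_mask[list(present)] = True

    # Present classes: expected count should be at least min_frac * n_pixels.
    lower_violation = np.maximum(0.0, min_frac * n_pixels - counts[present_mask]).sum()
    # Absent classes: expected count should be at most zero, so any mass is a violation.
    upper_violation = counts[~present_mask].sum()

    return (lower_violation + upper_violation) / n_pixels

# Four pixels, three classes; every pixel strongly predicts class 0.
logits = np.zeros((4, 3))
logits[:, 0] = 20.0
print(tag_constraint_penalty(logits, {0}))   # close to zero: tags agree with the prediction
print(tag_constraint_penalty(logits, {1}))   # large: the tagged class is missing from the prediction
```

Because both constraints are linear in the softmax output, a penalty of this kind could in principle be differentiated by an automatic differentiation framework and added to a standard stochastic gradient descent loop; the sketch only evaluates the penalty value.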