Context-based Image Segment Labeling (CBISL)
The paper addresses the problem of incomplete semantic information in images for computer vision applications, but its contribution is incremental, as it builds on existing gated PixelCNNs.
The paper tackles the problem of recovering semantic image features, such as objects and their positions, in images with missing regions. The reported results suggest that its four-directional model outperforms one-directional models and achieves human-comparable performance.
When working with images, one often faces incomplete or unclear information. Image inpainting can restore missing image regions, but it focuses on low-level image features such as pixel intensity, pixel gradient orientation, and color. This paper aims instead to recover semantic image features (objects and their positions) in images. Based on published gated PixelCNNs, we demonstrate a new approach, referred to as quadro-directional PixelCNN, that recovers missing objects and returns probable positions for objects based on the context. We call this approach context-based image segment labeling (CBISL). The results suggest that our four-directional model outperforms one-directional models (gated PixelCNN) and achieves human-comparable performance.
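To illustrate why four directions can help, the sketch below builds the standard causal (type-A) mask used by PixelCNN-style models and rotates it into four orientations. The abstract does not say how the quadro-directional model realizes its four directions, so the rotation scheme and the helper names (`causal_mask`, `rot90`, `four_direction_masks`, `union`) are assumptions made purely for illustration, not the authors' implementation.

```python
def causal_mask(k):
    """Type-A causal mask for a k x k kernel (k odd).

    A cell is 1 where a raster-scan model may look: every row above the
    centre, plus the cells left of the centre in the centre row.
    """
    c = k // 2
    return [[1 if (r < c or (r == c and col < c)) else 0 for col in range(k)]
            for r in range(k)]


def rot90(m):
    """Rotate a square list-of-lists by 90 degrees counter-clockwise."""
    n = len(m)
    return [[m[j][n - 1 - i] for j in range(n)] for i in range(n)]


def four_direction_masks(k):
    """Assumed scheme: one causal mask per 90-degree rotation."""
    masks = [causal_mask(k)]
    for _ in range(3):
        masks.append(rot90(masks[-1]))
    return masks


def union(masks):
    """Cells visible to at least one of the given masks."""
    n = len(masks[0])
    return [[max(m[r][c] for m in masks) for c in range(n)] for r in range(n)]


one_direction = causal_mask(3)
all_directions = union(four_direction_masks(3))
```

Under this assumption, a single direction sees only four of the eight neighbours of the centre cell, while the union of the four rotated masks sees all eight and still excludes the centre cell itself, which is the one being predicted.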