AI | May 24, 2017

Logic Tensor Networks for Semantic Image Interpretation

arXiv:1705.08968v1 | 240 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of extracting structured semantic descriptions from images for computer vision applications. It is an incremental advance that applies an existing Statistical Relational Learning (SRL) framework to Semantic Image Interpretation (SII) tasks.

The paper tackles the problem of Semantic Image Interpretation by applying Logic Tensor Networks to classify bounding boxes and detect part-of relations in images, showing that incorporating logical constraints improves performance over purely data-driven methods like Fast R-CNN and adds robustness to label errors.

Semantic Image Interpretation (SII) is the task of extracting structured semantic descriptions from images. It is widely agreed that the combined use of visual data and background knowledge is of great importance for SII. Recently, Statistical Relational Learning (SRL) approaches have been developed for reasoning under uncertainty and learning in the presence of data and rich knowledge. Logic Tensor Networks (LTNs) are an SRL framework which integrates neural networks with first-order fuzzy logic to allow (i) efficient learning from noisy data in the presence of logical constraints, and (ii) reasoning with logical formulas describing general properties of the data. In this paper, we develop and apply LTNs to two of the main tasks of SII, namely, the classification of an image's bounding boxes and the detection of the relevant part-of relations between objects. To the best of our knowledge, this is the first successful application of SRL to such SII tasks. The proposed approach is evaluated on a standard image processing benchmark. Experiments show that the use of background knowledge in the form of logical constraints can improve the performance of purely data-driven approaches, including the state-of-the-art Fast Region-based Convolutional Neural Networks (Fast R-CNN). Moreover, we show that the use of logical background knowledge adds robustness to the learning system when errors are present in the labels of the training data.
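To make the idea of a logical constraint with fuzzy truth values concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the Łukasiewicz connectives, the example rule ("a part that is a wheel does not belong to a cat"), the mean aggregation over groundings, and all truth degrees are assumptions chosen for illustration. In an LTN the truth degrees would come from neural networks applied to bounding-box features; here they are hard-coded.

```python
# Illustrative sketch only. The connectives, the rule, and the numbers below
# are assumptions for demonstration, not taken from the abstract above.

def fuzzy_not(a: float) -> float:
    """Fuzzy negation: 1 - a."""
    return 1.0 - a

def fuzzy_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm (assumed choice of conjunction)."""
    return max(0.0, a + b - 1.0)

def fuzzy_implies(a: float, b: float) -> float:
    """Lukasiewicz implication (assumed choice)."""
    return min(1.0, 1.0 - a + b)

def constraint_satisfaction(part_of: float, is_wheel: float, is_cat: float) -> float:
    """Truth degree of the hypothetical rule:
    partOf(x, y) AND Wheel(x) -> NOT Cat(y)."""
    premise = fuzzy_and(part_of, is_wheel)
    return fuzzy_implies(premise, fuzzy_not(is_cat))

def mean_satisfaction(groundings: list[tuple[float, float, float]]) -> float:
    """Aggregate the rule over several (x, y) pairs with a plain mean
    (one possible reading of a universal quantifier; an assumption)."""
    return sum(constraint_satisfaction(*g) for g in groundings) / len(groundings)

# Pair 1: box x is probably a wheel and part of y; y is probably not a cat.
consistent = constraint_satisfaction(part_of=0.9, is_wheel=0.8, is_cat=0.1)

# Pair 2: same x, but the classifier believes y is a cat, violating the rule.
violating = constraint_satisfaction(part_of=0.9, is_wheel=0.8, is_cat=0.9)

overall = mean_satisfaction([(0.9, 0.8, 0.1), (0.9, 0.8, 0.9)])
```

The violating pair receives a lower truth degree than the consistent one, so a learner that maximizes the aggregated satisfaction is pushed away from predictions that contradict the background knowledge, even when a training label is wrong.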

Code Implementations: 1 repo
