CL | CV | LG | NE | Mar 28, 2016

Deep Embedding for Spatial Role Labeling

arXiv:1603.08474v1 | 16 citations
Originality: Incremental advance
AI Analysis

This work addresses spatial role labeling for natural language processing applications, presenting an incremental improvement with new features and fine-tuning methods.

The paper tackles the problem of recognizing spatial relations between objects in text by introducing VIEW, a visually informed word embedding trained on COCO data, and achieves a 2.1% improvement in F1 score on the SpaceEval benchmark.

This paper introduces the visually informed embedding of word (VIEW), a continuous vector representation of a word extracted from a deep neural model trained on the Microsoft COCO data set to predict the spatial arrangements between visual objects, given a textual description. The model consists of a deep multilayer perceptron (MLP) stacked on top of a Long Short-Term Memory (LSTM) network, which is in turn preceded by an embedding layer. VIEW is used to transfer multimodal background knowledge to Spatial Role Labeling (SpRL) algorithms, which recognize spatial relations between objects mentioned in text. This work also contributes a new method for selecting complementary features and a fine-tuning method for the MLP that improves the $F_1$ measure when classifying words into spatial roles. VIEW is evaluated on Task 3 of the SemEval-2013 benchmark data set, SpaceEval.
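
The layer order described in the abstract (embedding layer, then LSTM, then MLP) can be illustrated with a minimal forward-pass sketch in NumPy. This is not the authors' implementation: the class name `ViewSketch`, all layer sizes, the single hidden MLP layer, the ReLU activation, the random initialization, and the 4-dimensional output standing in for a "spatial arrangement" are assumptions made purely for illustration. The sketch also assumes that the word vector is read from the embedding layer.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class ViewSketch:
    """Embedding -> LSTM -> MLP, forward pass only (hypothetical sizes)."""

    def __init__(self, vocab_size, embed_dim=8, hidden_dim=16,
                 mlp_dim=32, out_dim=4, seed=0):
        rng = np.random.default_rng(seed)

        def init(*shape):
            # Small random weights; a real model would learn these.
            return rng.normal(0.0, 0.1, size=shape)

        self.hidden_dim = hidden_dim
        self.embedding = init(vocab_size, embed_dim)
        # One stacked matrix for the four LSTM gates: input, forget, cell, output.
        self.W = init(4 * hidden_dim, embed_dim + hidden_dim)
        self.b = np.zeros(4 * hidden_dim)
        # MLP with a single hidden layer (assumption; the paper says "deep").
        self.W1, self.b1 = init(mlp_dim, hidden_dim), np.zeros(mlp_dim)
        self.W2, self.b2 = init(out_dim, mlp_dim), np.zeros(out_dim)

    def embed(self, token_ids):
        """Look up one embedding row per token id."""
        return self.embedding[np.asarray(token_ids)]

    def forward(self, token_ids):
        h = np.zeros(self.hidden_dim)  # LSTM hidden state
        c = np.zeros(self.hidden_dim)  # LSTM cell state
        for x in self.embed(token_ids):
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        hidden = np.maximum(0.0, self.W1 @ h + self.b1)  # ReLU
        return self.W2 @ hidden + self.b2


model = ViewSketch(vocab_size=10)
prediction = model.forward([1, 2, 3])   # output vector for a 3-token description
word_vector = model.embed([1])[0]       # embedding row for token id 1
```

Under these assumptions, `prediction` has one entry per output unit and `word_vector` is the per-word representation that would be passed on to a downstream spatial role labeler.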

Code Implementations: 1 repo