LG, CL, CV · Nov 19, 2015

Order-Embeddings of Images and Language

arXiv:1511.06361v6 · 595 citations · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the need for better cross-modal understanding in AI, though it appears incremental because it applies a new method to existing tasks.

The paper tackled the problem of modeling visual-semantic hierarchies across words, sentences, and images by explicitly representing partial order structures, resulting in improved performance for hypernym prediction and image-caption retrieval.

Hypernymy, textual entailment, and image captioning can be seen as special cases of a single visual-semantic hierarchy over words, sentences, and images. In this paper we advocate for explicitly modeling the partial order structure of this hierarchy. Towards this goal, we introduce a general method for learning ordered representations, and show how it can be applied to a variety of tasks involving images and language. We show that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.
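
The abstract does not spell out how an "ordered representation" is scored, so the following is only a minimal sketch. It assumes a coordinate-wise order on non-negative vectors, in which a more specific item dominates a more general one in every coordinate, and a squared hinge penalty for violations. The function name `order_violation` and the toy vectors are hypothetical and chosen for illustration only.

```python
import numpy as np

def order_violation(x, y):
    """Return a penalty that is zero exactly when x >= y in every coordinate.

    Assumed convention: x is the more specific item (e.g. "dog") and
    y is the more general one (e.g. "animal").
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Only coordinates where y exceeds x break the order, so clip at zero.
    return float(np.sum(np.maximum(0.0, y - x) ** 2))

# Hand-picked 2-D embeddings, not learned values.
dog = [3.0, 2.0]      # more specific concept
animal = [1.0, 1.0]   # more general concept

ok = order_violation(dog, animal)   # order respected
bad = order_violation(animal, dog)  # order violated

print(ok, bad)
```

Because the penalty is asymmetric, swapping the arguments changes the score, which is what lets an embedding of this kind represent a partial order rather than a symmetric similarity.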

Code Implementations: 3 repos
