LG CL CV NE ML
Apr 19, 2019

Challenges and Prospects in Vision and Language Research

arXiv:1904.09317v2
43 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of misleading progress in AI evaluation, which affects both researchers and developers. Its contribution is incremental, since it builds on existing critiques.

The paper reviews how current vision-language systems achieve high performance due to dataset and evaluation flaws rather than genuine intelligence, and proposes a path forward for more robust benchmarks.

Language-grounded image understanding tasks have often been proposed as a method for evaluating progress in artificial intelligence. Ideally, these tasks should test a plethora of capabilities that integrate computer vision, reasoning, and natural language understanding. However, recent studies have demonstrated that, rather than behaving as visual Turing tests, these tasks allow state-of-the-art systems to achieve good performance by exploiting flaws in datasets and evaluation procedures. We review the current state of affairs and outline a path forward.

