CL AI CV LG

Jun 5, 2017

Deep learning evaluation using deep linguistic processing

arXiv:1706.01322v239.21095 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of evaluating AI models. It offers researchers a method for creating challenging datasets that support more thorough model assessment, though the contribution is incremental because it builds on existing linguistic technology.

The paper tackles the problem of evaluating multimodal deep learning models by proposing the use of artificial data created with deep linguistic processing to complement standard evaluation methods, enabling detailed investigation of language understanding abilities beyond single performance metrics.

We discuss problems with the standard approaches to evaluation for tasks like visual question answering, and argue that artificial data can be used to address these as a complement to current practice. We demonstrate that with the help of existing 'deep' linguistic processing technology we are able to create challenging abstract datasets, which enable us to investigate the language understanding abilities of multimodal deep learning models in detail, as compared to a single performance value on a static and monolithic dataset.
