Teaching Machines to Read and Comprehend
This work gives AI researchers a foundational dataset and evaluation method for teaching machines to read and comprehend natural language.
The authors address the lack of large-scale datasets for machine reading comprehension by defining a new methodology that provides supervised data. This enables the development of attention-based deep neural networks that answer complex questions about real documents with minimal prior linguistic knowledge.
Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large-scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large-scale supervised reading comprehension data. This allows us to develop a class of attention-based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.
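
To make the idea of an attention-based reader more concrete, the following is a minimal sketch of generic dot-product attention over document tokens, conditioned on a question vector. It is an illustration only and not the authors' model: the function names `softmax` and `attend`, the toy two-dimensional vectors, and the use of a plain dot product as the scoring function are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, token_vectors):
    """Hypothetical helper: weight each document token by its relevance to the query.

    Relevance is scored with a plain dot product (an assumption for this sketch).
    Returns the attention weights and the weighted sum of the token vectors.
    """
    scores = [sum(q * t for q, t in zip(query, vec)) for vec in token_vectors]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return weights, context


# Toy document of three token vectors and one question vector (made-up values).
doc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [0.0, 2.0]
weights, context = attend(query, doc)
```

In this toy setup the second token aligns best with the question vector, so it receives the largest weight. A trained reader would learn the token and question representations from supervised question-answer data rather than use fixed vectors.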