CL, LG · Apr 16, 2014

Open Question Answering with Weakly Supervised Embedding Models

Antoine Bordes, Jason Weston, Nicolas Usunier

arXiv:1404.4326v1 · 353 citations

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of building scalable question-answering systems for any subject without extensive human labeling or domain-specific tools. The advance is incremental, since it builds on prior weakly supervised approaches.

The paper tackles open-domain question answering by learning to map questions and answers into a shared vector space. This enables schema-agnostic queries without grammars or lexicons and, when trained on weakly supervised data, yields major improvements over the existing Paralex method.

Building computers able to answer questions on any subject is a long-standing goal of artificial intelligence. Promising progress has recently been achieved by methods that learn to map questions to logical forms or database queries. Such approaches can be effective, but at the cost of either large amounts of human-labeled data or lexicons and grammars tailored by practitioners. In this paper, we instead take the radical approach of learning to map questions to vectorial feature representations. By mapping answers into the same space, one can query any knowledge base independent of its schema, without requiring any grammar or lexicon. Our method is trained with a new optimization procedure combining stochastic gradient descent followed by a fine-tuning step using the weak supervision provided by blending automatically and collaboratively generated resources. We empirically demonstrate that our model can capture meaningful signals from its noisy supervision, leading to major improvements over Paralex, the only existing method able to be trained on similar weakly labeled data.
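The core idea in the abstract, scoring a question against candidate answers in one shared vector space, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the paper's setup: the bag-of-words sum in `embed`, the dot-product `score`, the margin ranking loss in `train`, the hyperparameters, and the toy `PAIRS` data are all hypothetical. The fine-tuning step and the weakly supervised training resources mentioned in the abstract are not reproduced.

```python
import random

# Toy (question tokens, answer triple) pairs; purely illustrative data.
PAIRS = [
    (("who", "wrote", "hamlet"), ("shakespeare", "wrote", "hamlet")),
    (("what", "is", "the", "capital", "of", "france"), ("paris", "capital-of", "france")),
    (("who", "painted", "guernica"), ("picasso", "painted", "guernica")),
]
CANDIDATES = [answer for _, answer in PAIRS]


def embed(tokens, table):
    """Bag-of-words embedding (an assumption): sum the vectors of all tokens."""
    return [sum(column) for column in zip(*(table[t] for t in tokens))]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def score(question, answer, q_table, a_table):
    """Similarity of a question and an answer in the shared vector space."""
    return dot(embed(question, q_table), embed(answer, a_table))


def answer_question(question, candidates, q_table, a_table):
    """Return the candidate answer with the highest score for the question."""
    return max(candidates, key=lambda c: score(question, c, q_table, a_table))


def train(pairs, candidates, dim=8, lr=0.05, margin=0.1, epochs=300, seed=0):
    """SGD on a margin ranking loss (assumed): push the correct answer's score
    above a randomly sampled wrong answer's score by at least `margin`."""
    rng = random.Random(seed)
    q_vocab = sorted({t for q, _ in pairs for t in q})
    a_vocab = sorted({t for c in candidates for t in c})
    q_table = {t: [rng.gauss(0.0, 0.1) for _ in range(dim)] for t in q_vocab}
    a_table = {t: [rng.gauss(0.0, 0.1) for _ in range(dim)] for t in a_vocab}
    for _ in range(epochs):
        for question, answer in pairs:
            negative = rng.choice([c for c in candidates if c != answer])
            q = embed(question, q_table)
            pos = embed(answer, a_table)
            neg = embed(negative, a_table)
            if margin - dot(q, pos) + dot(q, neg) <= 0:
                continue  # margin already satisfied, no update
            # Gradient step on the hinge loss for each embedding involved.
            for t in question:
                q_table[t] = [w + lr * (p - n) for w, p, n in zip(q_table[t], pos, neg)]
            for t in answer:
                a_table[t] = [w + lr * x for w, x in zip(a_table[t], q)]
            for t in negative:
                a_table[t] = [w - lr * x for w, x in zip(a_table[t], q)]
    return q_table, a_table


if __name__ == "__main__":
    q_table, a_table = train(PAIRS, CANDIDATES)
    for question, _ in PAIRS:
        print(" ".join(question), "->", answer_question(question, CANDIDATES, q_table, a_table))
```

Because answers are scored purely by their embeddings, nothing in this sketch depends on the knowledge base's schema: any candidate that can be tokenized can be ranked, which is the property the abstract highlights.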
