CL · May 29, 2018

Multi-hop Inference for Sentence-level TextGraphs: How Challenging is Meaningfully Combining Information for Science Question Answering?

arXiv:1805.11267v1 · 107 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses a major barrier for inference models on complex science questions that require combining multiple facts. Its contribution is incremental, since it offers an empirical characterization of the problem rather than a new solution.

The paper takes up multi-hop inference for science question answering by empirically characterizing how hard it is to aggregate facts in sentence-level text graphs. It finds that semantic drift tends to be high and that chance aggregation quality is low, between 0.04% and 3% across the corpora studied.

Question Answering for complex questions is often modeled as a graph construction or traversal task, where a solver must build or traverse a graph of facts that answer and explain a given question. This "multi-hop" inference has been shown to be extremely challenging, with few models able to aggregate more than two facts before being overwhelmed by "semantic drift", or the tendency for long chains of facts to quickly drift off topic. This is a major barrier to current inference models, as even elementary science questions require an average of 4 to 6 facts to answer and explain. In this work we empirically characterize the difficulty of building or traversing a graph of sentences connected by lexical overlap, by evaluating chance sentence aggregation quality through 9,784 manually-annotated judgments across knowledge graphs built from three free-text corpora (including study guides and Simple Wikipedia). We demonstrate semantic drift tends to be high and aggregation quality low, at between 0.04% and 3%, and highlight scenarios that maximize the likelihood of meaningfully combining information.
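As an illustration of what "a graph of sentences connected by lexical overlap" can look like, the following is a minimal sketch, not the authors' implementation. The stop-word list, the tokenizer, the helper names (`content_words`, `build_overlap_graph`), and the three example sentences are all assumptions made for this example. Sentences become nodes, and two nodes are linked whenever they share at least one content word.

```python
import re
from itertools import combinations

# Assumption: a tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"a", "an", "the", "of", "is", "are", "to", "in", "and", "that", "it", "by", "for", "on"}


def content_words(sentence):
    """Lowercase the sentence, split it into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def build_overlap_graph(sentences):
    """Return {(i, j): shared_words} for every sentence pair sharing a content word."""
    words = [content_words(s) for s in sentences]
    edges = {}
    for i, j in combinations(range(len(sentences)), 2):
        shared = words[i] & words[j]
        if shared:
            edges[(i, j)] = shared
    return edges


# Hypothetical sentences chosen to show drift: 0 -> 1 is a useful hop,
# while 1 -> 2 and 0 -> 2 connect on a shared word but leave the topic.
sentences = [
    "A plant needs sunlight to grow.",
    "Sunlight is a kind of energy.",
    "Energy bills grow in winter.",
]
graph = build_overlap_graph(sentences)
for (i, j), shared in sorted(graph.items()):
    print(i, j, sorted(shared))
```

In this toy graph every pair of sentences is connected, yet only the first pair combines into a meaningful inference about plants. This is the kind of off-topic connection the abstract calls semantic drift.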
