CL | Aug 6, 2024

Making Long-Context Language Models Better Multi-Hop Reasoners

arXiv:2408.03246v1 | 34 citations | h-index: 14 | Has Code
Originality: Highly original
AI Analysis

This paper addresses multi-hop reasoning challenges for users of long-context language models. It is an incremental improvement that applies a novel method to a known bottleneck.

The paper tackles the tendency of long-context language models to struggle with multi-hop reasoning and noisy contexts. It introduces Reasoning with Attributions, which prompts models to provide an attribution for each assertion made during reasoning. The fine-tuned model performs competitively on multi-hop reasoning benchmarks, close to proprietary models such as ChatGPT and Claude-instant.

Recent advancements in long-context modeling have enhanced language models (LMs) for complex tasks across multiple NLP applications. Despite this progress, we find that these models struggle with multi-hop reasoning and exhibit decreased performance in the presence of noisy contexts. In this paper, we introduce Reasoning with Attributions, a novel approach that prompts LMs to supply attributions for each assertion during their reasoning. We validate our approach through experiments on three multi-hop datasets, employing both proprietary and open-source models, and demonstrate its efficacy and resilience. Furthermore, we explore methods to augment reasoning capabilities via fine-tuning and offer an attribution-annotated dataset and a specialized training strategy. Our fine-tuned model achieves competitive performance on multi-hop reasoning benchmarks, closely paralleling proprietary LMs such as ChatGPT and Claude-instant.
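
The idea of prompting a model to attribute each assertion can be illustrated with a minimal sketch. The prompt wording, the bracketed `[n]` citation format, and the helper names `build_attribution_prompt`, `extract_citations`, and `unattributed` are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import re


def build_attribution_prompt(question, passages):
    """Number each passage and ask for a bracketed citation in every assertion.

    The instruction text is a hypothetical example, not the paper's prompt.
    """
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(passages, start=1))
    return (
        "Answer the question using the passages below. "
        "Reason step by step, and in every sentence cite the supporting "
        "passage as [n] before the final punctuation.\n\n"
        f"Passages:\n{numbered}\n\n"
        f"Question: {question}\nReasoning:"
    )


def extract_citations(reasoning):
    """Return (sentence, cited passage numbers) pairs for a reasoning string."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", reasoning) if s.strip()]
    return [(s, [int(n) for n in re.findall(r"\[(\d+)\]", s)]) for s in sentences]


def unattributed(reasoning, num_passages):
    """Sentences with no citation, or with a citation outside 1..num_passages."""
    return [
        sentence
        for sentence, cites in extract_citations(reasoning)
        if not cites or any(c < 1 or c > num_passages for c in cites)
    ]
```

A check like `unattributed` could be used to flag reasoning steps that lack support in the context, for example before accepting an answer produced over a noisy set of passages. The final answer sentence will also be flagged unless it carries a citation.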

Code Implementations: 1 repo