CL Jun 20, 2024

Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell

Taiming Lu, Muhan Gao, Kuai Yu, Adam Byerly, Daniel Khashabi

arXiv:2406.14673v2 20.441 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key limitation of LLMs in applications that require long-context reasoning. It is incremental, however, because it analyzes existing failures without proposing a new solution.

The study investigated LLMs' difficulty in using information from long contexts, finding that while they encode target positions, they often fail to generate accurate responses, revealing a 'know but don't tell' phenomenon.

Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
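To make "probing hidden representations" concrete, here is a minimal sketch of a probing classifier. Everything in it is an assumption for illustration: the vectors are synthetic stand-ins for hidden states rather than activations from a real LLM, the names `sample_states`, `fit_centroids`, and `probe_accuracy` are hypothetical, and the nearest-centroid probe is a deliberately simple substitute, since the abstract does not say which probe the authors use.

```python
import random

def sample_states(directions, n_per_pos, noise, rng):
    """Synthetic 'hidden states': each target position has its own direction,
    and every sample is that direction plus Gaussian noise (an assumption)."""
    return [
        ([d + rng.gauss(0, noise) for d in directions[pos]], pos)
        for pos in range(len(directions))
        for _ in range(n_per_pos)
    ]

def fit_centroids(train, n_pos, dim):
    """Fit the probe: average the training vectors for each position label."""
    sums = [[0.0] * dim for _ in range(n_pos)]
    counts = [0] * n_pos
    for vec, pos in train:
        counts[pos] += 1
        for i, value in enumerate(vec):
            sums[pos][i] += value
    return [[s / counts[p] for s in sums[p]] for p in range(n_pos)]

def predict(centroids, vec):
    """Return the position whose centroid is closest in squared distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(vec, center))
    return min(range(len(centroids)), key=lambda p: sq_dist(centroids[p]))

def probe_accuracy(centroids, data):
    """Fraction of held-out states whose position the probe recovers."""
    hits = sum(1 for vec, pos in data if predict(centroids, vec) == pos)
    return hits / len(data)

rng = random.Random(0)
N_POS, DIM = 5, 16  # hypothetical: 5 candidate positions, 16-dim states
directions = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_POS)]
train = sample_states(directions, 30, 0.5, rng)
test = sample_states(directions, 10, 0.5, rng)

centroids = fit_centroids(train, N_POS, DIM)
probe_acc = probe_accuracy(centroids, test)
chance = 1 / N_POS
print(f"probe accuracy: {probe_acc:.2f} (chance: {chance:.2f})")
```

In a real study the vectors would be per-layer activations from the model. The "know but don't tell" gap would then appear as probe accuracy that stays high for examples where the model's generated answer is wrong.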

