CL · AI · Sep 19, 2024

Pay Attention to What Matters

arXiv:2409.19001v1 · 1 citation · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses a key limitation in LLM usability for users who need precise instruction adherence, though it appears incremental because it builds on existing attention mechanisms.

The paper tackles the limited ability of Large Language Models to align their outputs with user instructions. It introduces GUIDE, a method that increases attention scores in instruction tokens, raising instruction-following accuracy from 29.4% to 60.4%.

Despite the remarkable success of Large Language Models (LLMs), they still exhibit a limited capability to align their outputs to the user instructions. In this work, we introduce a simple and effective method, which we name GUIDE, that mechanistically increases attention scores in instruction tokens. To support this operation, we present Influence, a novel metric that highlights how the user's instructions propagate through the transformer layers and impact the LLM output. Our results show that GUIDE improves the accuracy of following instructions from 29.4% to 60.4%, outperforming natural prompting alternatives and Supervised Fine-Tuning up to 1M tokens.
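
As a rough illustration of what "increasing attention scores in instruction tokens" can mean, the sketch below adds a constant bias to the pre-softmax scores of instruction positions for a single query and compares the attention mass before and after. It is not the authors' implementation. The additive form of the boost, the names `boosted_attention`, `instruction_mask`, and `delta`, and the toy values are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def boosted_attention(logits, instruction_mask, delta):
    """Add `delta` to the pre-softmax score of each instruction token.

    Hypothetical helper: `delta` is an assumed additive bias, not a value
    or formula taken from the paper.
    """
    biased = [
        score + delta if is_instruction else score
        for score, is_instruction in zip(logits, instruction_mask)
    ]
    return softmax(biased)


# Toy example: one query attending to four keys, the first two of which
# are marked as instruction tokens.
logits = [0.0, 0.0, 0.0, 0.0]
instruction_mask = [True, True, False, False]

baseline = softmax(logits)
boosted = boosted_attention(logits, instruction_mask, delta=1.0)

# Total attention weight that lands on instruction tokens.
mass_before = sum(w for w, m in zip(baseline, instruction_mask) if m)
mass_after = sum(w for w, m in zip(boosted, instruction_mask) if m)
```

Because softmax renormalizes, any extra weight given to instruction tokens is taken from the remaining tokens, so the size of the bias trades instruction adherence against attention to the rest of the context.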

