LG Dec 12, 2025

Learning to Extract Context for Context-Aware LLM Inference

Minseon Kim, Lucas Caccia, Zhengyan Shi, Matheus Pereira, Marc-Alexandre Côté, Xingdi Yuan, Alessandro Sordoni

arXiv:2512.11986v17.11 citations h-index: 35

Originality: Incremental advance

AI Analysis

The work addresses safety and reliability issues in LLMs by reducing harmful outputs and unnecessary refusals, though it is incremental because it builds on existing context-aware methods.

The paper tackles ambiguous user prompts in large language models (LLMs) by proposing a framework that extracts contextual information from the prompt itself and uses it to guide responses. It reports an average 5.6% reduction in harmful responses on SafetyInstruct across multiple foundation models and a 6.2% improvement in the harmonic mean of attack success rate and compliance on benign prompts on XSTest and WildJailbreak.

User prompts to large language models (LLMs) are often ambiguous or under-specified, and subtle contextual cues shaped by user intentions, prior knowledge, and risk factors strongly influence what constitutes an appropriate response. Misinterpreting intent or risks may lead to unsafe outputs, while overly cautious interpretations can cause unnecessary refusal of benign requests. In this paper, we question the conventional framework in which LLMs generate immediate responses to requests without considering broader contextual factors. User requests are situated within broader contexts such as intentions, knowledge, and prior experience, which strongly influence what constitutes an appropriate answer. We propose a framework that extracts and leverages such contextual information from the user prompt itself. Specifically, a reinforcement-learning-based context generator, designed in an autoencoder-like fashion, is trained to infer contextual signals grounded in the prompt and use them to guide response generation. This approach is particularly important for safety tasks, where ambiguous requests may bypass safeguards while benign but confusing requests can trigger unnecessary refusals. Experiments show that our method reduces harmful responses by an average of 5.6% on the SafetyInstruct dataset across multiple foundation models and improves the harmonic mean of attack success rate and compliance on benign prompts by 6.2% on XSTest and WildJailbreak. These results demonstrate the effectiveness of context extraction for safer and more reliable LLM inference.
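The abstract reports a harmonic mean that combines attack success rate with compliance on benign prompts. The listing does not give the exact formula, so the following is only a minimal sketch of how such a combined score could be computed. It assumes the attack success rate is first converted to a safety rate (1 minus ASR) so that higher is better for both terms; the function names and the example numbers are hypothetical and are not taken from the paper.

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two non-negative scores; returns 0.0 if either is 0."""
    if a < 0 or b < 0:
        raise ValueError("scores must be non-negative")
    if a == 0 or b == 0:
        return 0.0
    return 2 * a * b / (a + b)


def safety_compliance_score(attack_success_rate: float, compliance_rate: float) -> float:
    """Combine safety and helpfulness into one number in [0, 1].

    Assumption (not confirmed by the source): the attack success rate is
    turned into a safety rate, 1 - ASR, before averaging.
    """
    safety_rate = 1.0 - attack_success_rate
    return harmonic_mean(safety_rate, compliance_rate)


# Illustrative, made-up numbers (not results from the paper).
balanced = safety_compliance_score(attack_success_rate=0.2, compliance_rate=0.8)
over_refusing = safety_compliance_score(attack_success_rate=0.0, compliance_rate=0.2)
```

Under these assumptions the harmonic mean penalizes imbalance: a model that blocks every attack but refuses most benign prompts (`over_refusing`, roughly 0.33) scores well below one that is moderately good on both axes (`balanced`, roughly 0.8), which is why a single score of this kind suits the safety versus over-refusal trade-off described in the abstract.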
