CL · Mar 20

PoC: Performance-oriented Context Compression for Large Language Models via Performance Prediction

arXiv:2603.19733 · 82.6 · h-index: 14
AI Analysis

This work addresses the need for reliable, efficient deployment of context compression in LLMs. It is incremental, however: it builds on existing compression methods and adds a performance-oriented way to control them.

The paper tackles unpredictable performance degradation in context compression for Large Language Models. It introduces Performance-oriented Context Compression (PoC), which uses a performance predictor to automatically find the most aggressive compression ratio that still meets a specified performance floor. The context-aware variant of PoC predicts performance more accurately and achieves better overall performance than the context-agnostic variant.

While context compression can mitigate the growing inference costs of Large Language Models (LLMs) by shortening contexts, existing methods that specify a target compression ratio or length suffer from unpredictable performance degradation, hindering their reliable deployment. We introduce a paradigm shift to Performance-oriented Context Compression (PoC), where developers specify an acceptable performance floor instead of a compression ratio. PoC employs a lightweight performance predictor to automatically find the most aggressive compression ratio that satisfies this constraint before steering an off-the-shelf compressor. We design and compare two predictor variants: a simple context-agnostic predictor and a more sophisticated context-aware one that considers the input's inherent compressibility. On both question-answering and summarization benchmarks, the context-aware predictor consistently achieves lower performance prediction error than the context-agnostic predictor, while the resulting context-aware PoC attains a superior overall performance. Our work paves the way for a more reliable, efficient, and performance-aware deployment of context compression for LLMs.
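The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the search is done, so a plain scan over a grid of candidate ratios is assumed here. The names `most_aggressive_keep_ratio`, `agnostic_predictor`, and `make_aware_predictor`, the linear toy predictors, and the convention that a "keep ratio" is the fraction of tokens retained (smaller means more aggressive) are all hypothetical choices made for illustration.

```python
from typing import Callable, Optional, Sequence


def most_aggressive_keep_ratio(
    predict: Callable[[float], float],
    floor: float,
    candidates: Sequence[float],
) -> Optional[float]:
    """Return the smallest keep ratio whose predicted performance meets the floor.

    Returns None when no candidate satisfies the floor.
    """
    # Scan from most aggressive (smallest keep ratio) to least aggressive.
    for keep_ratio in sorted(candidates):
        if predict(keep_ratio) >= floor:
            return keep_ratio
    return None


# Hypothetical context-agnostic predictor: one fixed curve for every input.
def agnostic_predictor(keep_ratio: float) -> float:
    return 0.5 + 0.4 * keep_ratio


# Hypothetical context-aware predictor: the curve also depends on a
# per-input compressibility score in [0, 1] (an assumed feature).
def make_aware_predictor(compressibility: float) -> Callable[[float], float]:
    def predict(keep_ratio: float) -> float:
        # A more compressible input loses less performance per dropped token.
        slope = 0.4 * (1.0 - compressibility)
        return 0.9 - slope * (1.0 - keep_ratio)

    return predict


CANDIDATES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]

if __name__ == "__main__":
    floor = 0.8
    print(most_aggressive_keep_ratio(agnostic_predictor, floor, CANDIDATES))
    print(most_aggressive_keep_ratio(make_aware_predictor(0.75), floor, CANDIDATES))
```

In this toy setup the context-agnostic predictor picks the same keep ratio for every input, while the context-aware one allows a much smaller keep ratio for an input it scores as highly compressible. The chosen ratio would then be passed to an off-the-shelf compressor, which is outside the scope of this sketch.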

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
