CL · LG · ML · Jun 24, 2024

Cascade Reward Sampling for Efficient Decoding-Time Alignment

arXiv:2406.16306v3 · 36 citations
Originality: Incremental advance
AI Analysis

This paper addresses efficiency bottlenecks in decoding-time alignment of LLMs with human preferences, which matters for practical applications. It is an incremental improvement over existing decoding-time methods.

The paper tackled the inefficiency of decoding-time alignment in large language models by introducing Cascade Reward Sampling (CARDS), which reduced decoding time by about 70% and achieved over 90% win-ties in utility and safety benchmarks.

Aligning large language models (LLMs) with human preferences is essential for their applications. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that avoids fine-tuning model parameters. This approach retains the general utility of pretrained LLMs but often suffers from significant inefficiencies during decoding, primarily due to wasted token generation and excessive reward evaluations. To address these challenges, we introduce Cascade Reward Sampling (CARDS) to resolve both efficiency bottlenecks in decoding-time alignment. Specifically, we develop a segment-level rejection sampling algorithm that minimizes redundant computations of both LLMs and reward models (RMs). Central to CARDS is an uncertainty-based segmentation mechanism, which ensures the accuracy of RM evaluations on incomplete segments. Furthermore, we provide a detailed analysis of reward scores on segments to elucidate the improved alignment performance. Experimental results demonstrate that CARDS significantly improves decoding efficiency, alignment quality, and general utility compared to existing decoding-time alignment methods, achieving approximately a 70% reduction in decoding time and over 90% win-ties in utility and safety benchmarks.
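The segment-level rejection sampling idea in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the toy language model, the toy reward function, the use of next-token entropy as the uncertainty signal, the sigmoid acceptance rule, and every name and default value (`toy_next_token_probs`, `toy_reward`, `entropy_threshold`, `reward_threshold`, and so on) are assumptions made for illustration only. The sketch shows the general shape: cut a segment where the model becomes uncertain, score the partial response once per segment, and resample only that segment on rejection.

```python
import math
import random

VOCAB = ["the", "a", "good", "bad", "answer", "is", "."]

def toy_next_token_probs(prefix):
    """Hypothetical stand-in for an LLM: a distribution over VOCAB."""
    # Assumption: the toy model is unsure (flat distribution) right after ".",
    # and otherwise puts most of its mass on one token picked by prefix length.
    if prefix and prefix[-1] == ".":
        return [1.0 / len(VOCAB)] * len(VOCAB)
    probs = [0.05] * len(VOCAB)
    probs[len(prefix) % len(VOCAB)] += 1.0 - 0.05 * len(VOCAB)
    return probs

def toy_reward(tokens):
    """Hypothetical stand-in for a reward model scoring a partial response."""
    return tokens.count("good") - tokens.count("bad")

def entropy(probs):
    """Shannon entropy (nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def sample_segment(prefix, rng, entropy_threshold, max_len):
    """Sample tokens until the next-token entropy exceeds the threshold."""
    segment = []
    while len(segment) < max_len:
        probs = toy_next_token_probs(prefix + segment)
        # Assumed segmentation rule: end the segment before an uncertain token,
        # but always emit at least one token so decoding makes progress.
        if segment and entropy(probs) > entropy_threshold:
            break
        segment.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return segment

def cascade_sample(rng, num_tokens=12, entropy_threshold=1.5, max_seg_len=4,
                   reward_threshold=0.0, temperature=1.0, max_tries=20):
    """Build a response segment by segment with reward-based rejection."""
    response, rm_calls = [], 0
    while len(response) < num_tokens:
        for _ in range(max_tries):
            candidate = sample_segment(response, rng, entropy_threshold, max_seg_len)
            rm_calls += 1  # one reward evaluation per candidate segment
            score = toy_reward(response + candidate)
            # Assumed acceptance rule: sigmoid of the reward margin.
            accept_prob = 1.0 / (1.0 + math.exp(-(score - reward_threshold) / temperature))
            if rng.random() < accept_prob:
                break
        # If every try was rejected, the last candidate is kept as a fallback.
        response.extend(candidate)
    return response[:num_tokens], rm_calls

response, rm_calls = cascade_sample(random.Random(0))
print(" ".join(response), "| reward-model calls:", rm_calls)
```

In this sketch a rejected segment costs only the tokens of that segment and one reward call, rather than a full response or one reward call per token, which is the kind of saving the abstract attributes to segment-level sampling.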

Code Implementations: 1 repo
