CL AI May 18, 2025

PSC: Extending Context Window of Large Language Models via Phase Shift Calibration

arXiv:2505.12423v124 citations h-index: 2 Has Code EMNLP

Originality: Incremental advance

AI Analysis

This work addresses a key bottleneck in scaling LLMs for longer contexts, offering an incremental improvement over prior methods.

The paper tackles the challenge of extending the context window of large language models that use Rotary Position Embedding (RoPE). It introduces Phase Shift Calibration (PSC), a module that enhances existing methods such as PI, YaRN, and LongRoPE and reduces perplexity at context window sizes of up to 64k.

Rotary Position Embedding (RoPE) is an efficient position encoding approach that is widely used in large language models (LLMs). Recently, many methods have been proposed to extend the context window of RoPE-based models. The core idea of these methods is to predefine or search for a set of factors that rescale the base frequencies of RoPE. However, predefining optimal factors is difficult for existing methods because the search space is exponential. In view of this, we introduce PSC (Phase Shift Calibration), a small module for calibrating the frequencies predefined by existing methods. We demonstrate that PSC further improves many existing methods, such as PI, YaRN, and LongRoPE. We conducted extensive experiments across multiple models and tasks. The results demonstrate that (1) when PSC is enabled, the comparative reductions in perplexity grow as the context window size increases from 16k to 32k and up to 64k, and (2) our approach is broadly applicable and robust across a variety of models and tasks. The code can be found at https://github.com/WNQzhu/PSC.
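
To make the idea of rescaling RoPE frequencies by a set of factors concrete, here is a minimal sketch in plain Python. It is not taken from the PSC repository and does not implement the PSC module itself, which the abstract does not specify. The function names (`rope_frequencies`, `rescale_frequencies`, `rotation_angles`), the head dimension of 8, and the scale of 4 are illustrative assumptions. It uses the standard RoPE frequencies, theta_i = base^(-2i/d), and a single uniform factor in the style of PI (Position Interpolation).

```python
import math

def rope_frequencies(head_dim, base=10000.0):
    """Standard RoPE frequencies: theta_i = base^(-2i/d) for i = 0 .. d/2 - 1."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def rescale_frequencies(freqs, factors):
    """Divide each frequency by its own rescaling factor (hypothetical helper)."""
    if len(freqs) != len(factors):
        raise ValueError("need exactly one factor per frequency")
    return [f / s for f, s in zip(freqs, factors)]

def rotation_angles(position, freqs):
    """Rotation angle applied to each 2-D pair of dimensions at a token position."""
    return [position * f for f in freqs]

head_dim = 8   # illustrative value, far smaller than in a real model
scale = 4.0    # e.g. extending a 2k-token window to 8k tokens

freqs = rope_frequencies(head_dim)

# PI-style choice: the same factor for every frequency.
# Other methods predefine or search for a different factor per frequency.
pi_freqs = rescale_frequencies(freqs, [scale] * len(freqs))

# Position 8192 under the rescaled frequencies gets the same angles as
# position 2048 under the original ones, so it stays in the trained range.
extended = rotation_angles(8192, pi_freqs)
original = rotation_angles(2048, freqs)
print(all(math.isclose(a, b) for a, b in zip(extended, original)))
```

With one factor per frequency and several candidate values for each, the number of possible factor sets grows exponentially in the number of frequencies, which is the search-space problem the abstract refers to.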
