Lost-in-the-Middle in Long-Text Generation: Synthetic Dataset, Evaluation Framework, and Mitigation
This work addresses a bottleneck in long-input, long-output text generation, a setting that lacks benchmarks and suffers from performance degradation as inputs grow. The contribution is incremental, however, because it builds on existing retrieval-augmented methods.
The paper tackles the 'lost-in-the-middle' problem in long-text generation with two contributions: LongInOutBench, a synthetic dataset and evaluation framework, and RAL-Writer, which retrieves and restates overlooked content to mitigate the issue. Evaluations on LongInOutBench against comparable baselines indicate that the approach is effective.
Existing long-text generation methods primarily concentrate on producing lengthy texts from short inputs, neglecting long-input, long-output tasks. Such tasks have numerous practical applications but lack available benchmarks. Moreover, as the input grows in length, existing methods inevitably encounter the "lost-in-the-middle" phenomenon. In this paper, we first introduce a Long Input and Output Benchmark (LongInOutBench), comprising a synthetic dataset and a comprehensive evaluation framework, to fill the benchmark gap. We then develop the Retrieval-Augmented Long-Text Writer (RAL-Writer), which retrieves and restates important yet overlooked content, mitigating the "lost-in-the-middle" issue by constructing explicit prompts. Finally, we use LongInOutBench to evaluate RAL-Writer against comparable baselines, and the results demonstrate the effectiveness of our approach. Our code has been released at https://github.com/OnlyAR/RAL-Writer.
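The "retrieve and restate" idea can be illustrated with a toy sketch. This is not the released RAL-Writer code: the abstract does not specify the retriever, the scoring rule, or the prompt template, so the lexical-overlap scorer, the middle-position weighting, and every function name below are assumptions made purely for illustration.

```python
import re


def split_into_chunks(text, chunk_size=50):
    """Split text into chunks of at most `chunk_size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def relevance(chunk, instruction):
    """Assumed scorer: number of chunk tokens that also occur in the instruction."""
    instruction_tokens = set(tokenize(instruction))
    return sum(1 for tok in tokenize(chunk) if tok in instruction_tokens)


def middle_weight(index, total):
    """Assumed weighting: 1.0 at the center of the input, 0.0 at both ends."""
    if total <= 1:
        return 0.0
    position = index / (total - 1)
    return 1.0 - abs(2.0 * position - 1.0)


def select_chunks(document, instruction, top_k=2, chunk_size=50):
    """Pick up to `top_k` relevant chunks, boosting those near the middle."""
    chunks = split_into_chunks(document, chunk_size)
    scored = []
    for i, chunk in enumerate(chunks):
        score = relevance(chunk, instruction) * (1.0 + middle_weight(i, len(chunks)))
        if score > 0:  # skip chunks with no overlap at all
            scored.append((score, i))
    best = sorted(scored, reverse=True)[:top_k]
    # Return the selected chunks in their original document order.
    return [chunks[i] for i in sorted(i for _, i in best)]


def build_prompt(document, instruction, top_k=2, chunk_size=50):
    """Append the selected chunks as an explicit restatement before the task."""
    restated = select_chunks(document, instruction, top_k, chunk_size)
    bullet_list = "\n".join(f"- {c}" for c in restated)
    return (f"{document}\n\n"
            f"Key passages to keep in mind:\n{bullet_list}\n\n"
            f"Task: {instruction}")
```

Calling `build_prompt(long_document, writing_instruction)` yields a prompt in which passages likely to be overlooked are repeated near the end, next to the task. A real system would presumably replace the overlap scorer with a stronger retriever.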