CL · AI · IT · May 18, 2025

Measuring Information Distortion in Hierarchical Ultra long Novel Reconstruction: The Optimal Expansion Ratio

arXiv:2505.12572v2
Originality: Incremental advance
AI Analysis

This paper addresses information preservation in ultra-long novel generation, but its contribution is incremental because it builds on existing text compression methods.

The paper tackles the problem of semantic distortion in ultra-long novel reconstruction using a hierarchical framework, finding that an optimal compression-expansion ratio significantly reduces distortion compared to non-optimal ratios.

A two-stage novel generation framework (outline -> section outline -> manuscript) is widely used in long novel generation (e.g., DOME, Plan&Write, Long Writer), but such frameworks have rarely been studied for ultra-long novel (>1M words) reconstruction. Building on recent text compression methods (LLMZip, LLM2Vec), we conduct an information-theoretic analysis to quantify semantic distortion under different compression-expansion ratios. We examine how outline length affects information preservation. Experiments on ultra-long novels show that the optimal compression-expansion ratio significantly reduces semantic distortion compared to non-optimal ratios.
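As a rough illustration of what a compression-expansion ratio means in this hierarchy, the sketch below computes the length ratio between consecutive levels (outline, section outline, manuscript) and the overall ratio. The function name `expansion_ratios` and the word counts are hypothetical and chosen only for illustration. They are not taken from the paper, and the paper's own distortion measure is not reproduced here.

```python
from math import prod


def expansion_ratios(level_lengths):
    """Return the expansion ratio between each pair of consecutive levels.

    level_lengths: word counts ordered from the shortest level (outline)
    to the longest (manuscript). This is an assumed, simplified definition
    of "expansion ratio" as a plain length ratio.
    """
    if len(level_lengths) < 2:
        raise ValueError("need at least two levels")
    if any(n <= 0 for n in level_lengths):
        raise ValueError("lengths must be positive")
    return [
        longer / shorter
        for shorter, longer in zip(level_lengths, level_lengths[1:])
    ]


# Hypothetical word counts: outline, section outline, manuscript.
levels = [1_000, 50_000, 1_000_000]

ratios = expansion_ratios(levels)   # one ratio per expansion stage
overall = prod(ratios)              # equals manuscript length / outline length

print(ratios, overall)
```

Read in the other direction, the same numbers are compression ratios: a longer outline lowers the ratio each stage has to bridge. That trade-off between outline length and information preservation is what the abstract says the paper examines.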

