Optimizing Decomposition for Optimal Claim Verification
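This work addresses a specific bottleneck in factuality evaluation for long-form text, namely the mismatch between how claims are decomposed and what the downstream verifier handles best, and offers incremental improvements over existing methods.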
The paper tackles suboptimal verification in the Decompose-Then-Verify paradigm, which it attributes to misalignment between decomposition and verification. It proposes a reinforcement learning framework that improves verification confidence by 0.07 and accuracy by 0.12 on average (on a 0-1 scale).
Current research on the \textit{Decompose-Then-Verify} paradigm for evaluating the factuality of long-form text typically treats decomposition and verification in isolation, overlooking their interactions and potential misalignment. We find that existing decomposition policies, typically hand-crafted demonstrations, do not align well with downstream verifiers in terms of atomicity -- a novel metric quantifying information density -- leading to suboptimal verification results. We formulate finding the optimal decomposition policy for optimal verification as a bilevel optimization problem. To approximate a solution for this strongly NP-hard problem, we propose dynamic decomposition, a reinforcement learning framework that leverages verifier feedback to learn a policy for dynamically decomposing claims to verifier-preferred atomicity. Experimental results show that dynamic decomposition outperforms existing decomposition policies, improving verification confidence by 0.07 and accuracy by 0.12 (on a 0-1 scale) on average across varying verifiers, datasets, and atomicities of input claims.
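
To make the idea of decomposing claims to a verifier-preferred atomicity concrete, the following is a minimal toy sketch, not the paper's method. It replaces the learned reinforcement learning policy with a greedy rule: split a claim only when the verifier's mean confidence on the parts exceeds its confidence on the whole. The names `split_in_half`, `toy_confidence`, and `decompose_with_feedback` are hypothetical, and the confidence function is an invented stand-in for a real verifier.

```python
from typing import Callable, List


def split_in_half(claim: str) -> List[str]:
    """Hypothetical decomposer: split once on the first ' and '."""
    left, sep, right = claim.partition(" and ")
    if not sep:
        return [claim]  # nothing to split on
    return [left.strip(), right.strip()]


def toy_confidence(claim: str) -> float:
    """Hypothetical verifier confidence in [0, 1].

    Assumption for illustration only: this stand-in verifier is most
    confident on claims of about four words.
    """
    n_words = len(claim.split())
    return max(0.0, 1.0 - abs(n_words - 4) / 10.0)


def decompose_with_feedback(
    claim: str,
    confidence: Callable[[str], float] = toy_confidence,
    split: Callable[[str], List[str]] = split_in_half,
    max_depth: int = 3,
) -> List[str]:
    """Greedily split a claim while verifier feedback improves.

    This is a greedy stand-in for a learned policy: the split is accepted
    only if the mean confidence of the parts beats that of the whole claim.
    """
    if max_depth == 0:
        return [claim]
    parts = split(claim)
    if len(parts) < 2:
        return [claim]
    before = confidence(claim)
    after = sum(confidence(p) for p in parts) / len(parts)
    if after <= before:
        return [claim]  # the verifier prefers the coarser claim
    result: List[str] = []
    for part in parts:
        result.extend(decompose_with_feedback(part, confidence, split, max_depth - 1))
    return result


if __name__ == "__main__":
    print(decompose_with_feedback("Paris is in France and Berlin is in Germany"))
    print(decompose_with_feedback("Salt and pepper"))
```

In this sketch the stopping point depends on the verifier rather than on a fixed decomposition template, which is the alignment the abstract argues for; a learned policy would replace the greedy comparison with a decision trained on verifier feedback.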