Improving Zero-shot Sentence Decontextualisation with Content Selection and Planning
This paper addresses the issue of ambiguous extracted sentences in NLP tasks such as evidence extraction, though the contribution is incremental because it builds on prior decontextualisation work.
The paper tackles the problem of making extracted sentences understandable out of context by proposing a zero-shot decontextualisation framework that selects and plans content, resulting in improved semantic integrity and discourse coherence compared to existing methods.
Extracting individual sentences from a document as evidence or reasoning steps is common in many NLP tasks. However, extracted sentences often lack the context needed to understand them, such as coreference and background information. To this end, we propose a content selection and planning framework for zero-shot decontextualisation, which determines what content should be mentioned, and in what order, for a sentence to be understood out of context. Specifically, given a potentially ambiguous sentence and its context, we first segment them into basic semantically-independent units. We then identify potentially ambiguous units in the given sentence and extract relevant units from the context based on their discourse relations. Finally, we generate a content plan to rewrite the sentence by enriching each ambiguous unit with its relevant units. Experimental results demonstrate that our approach is competitive for sentence decontextualisation, producing sentences with better semantic integrity and discourse coherence than existing methods.
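To make the pipeline described above concrete, the following is a minimal, illustrative sketch of its three planning stages: segmentation, ambiguity detection, and selection of supporting context units. It is not the authors' implementation. Every name in it (`segment`, `is_ambiguous`, `relevant_units`, `build_plan`, `PlanStep`, `AMBIGUOUS_CUES`) is hypothetical, and simple heuristics stand in for the real components: punctuation-based splitting instead of semantic segmentation, a pronoun list instead of an ambiguity detector, and recency instead of discourse relations. The final rewriting step, which would be done by a language model, is omitted.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumption: pronouns and demonstratives are used as a crude ambiguity signal.
AMBIGUOUS_CUES = {"he", "she", "it", "they", "this", "that", "these", "those",
                  "his", "her", "its", "their"}


@dataclass
class PlanStep:
    """One entry of the content plan: a unit plus the context units that enrich it."""
    unit: str
    support: List[str] = field(default_factory=list)


def segment(text: str) -> List[str]:
    """Split text into rough clause-level units (stand-in for semantic segmentation)."""
    parts = re.split(r"[.;,]\s*|\s+(?:and|but|because)\s+", text)
    return [p.strip() for p in parts if p and p.strip()]


def is_ambiguous(unit: str) -> bool:
    """Flag a unit as potentially ambiguous if it contains a cue word."""
    words = set(re.findall(r"[a-z']+", unit.lower()))
    return bool(words & AMBIGUOUS_CUES)


def relevant_units(context_units: List[str], top_k: int = 1) -> List[str]:
    """Pick the most recent unambiguous context units.

    Assumption: recency is a toy proxy for the discourse relations used in the paper.
    """
    candidates = [u for u in context_units if not is_ambiguous(u)]
    return candidates[-top_k:]


def build_plan(sentence: str, context: str, top_k: int = 1) -> List[PlanStep]:
    """Return an ordered content plan: each sentence unit with its supporting units."""
    context_units = segment(context)
    plan = []
    for unit in segment(sentence):
        support = relevant_units(context_units, top_k) if is_ambiguous(unit) else []
        plan.append(PlanStep(unit=unit, support=support))
    return plan


example_plan = build_plan(
    sentence="She won the Nobel Prize in 1903.",
    context="Marie Curie was a physicist. She studied radioactivity.",
)
for step in example_plan:
    print(step.unit, "<-", step.support)
```

In this toy example, the unit containing "She" is flagged as ambiguous and paired with the most recent unambiguous context unit, which names the referent. A rewriting model could then consume the plan to produce a standalone sentence.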