CL, Jan 25, 2021

With Measured Words: Simple Sentence Selection for Black-Box Optimization of Sentence Compression Algorithms

arXiv:2101.10096v1800 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency and quality in text summarization for NLP applications, but it is an incremental advance because it builds on existing compression methods.

The paper addresses how to choose which sentences to compress when using a black-box sentence compression algorithm, so as to maximize both compression rate and quality. The authors show that their optimizer improves accuracy and Rouge-F1 score on three datasets compared with applying the compression algorithms directly.

Sentence compression is the task of generating a shorter yet grammatical version of a given sentence that preserves the essence of the original. This paper proposes a Black-Box Optimizer for Compression (B-BOC): given a black-box compression algorithm, and assuming that not all sentences need to be compressed, find the best candidates for compression in order to maximize both compression rate and quality. Given a required compression ratio, we consider two scenarios: (i) single-sentence compression, and (ii) sentence-sequence compression. In the first scenario, our optimizer is trained to predict how well each sentence could be compressed while meeting the specified ratio requirement. In the second, the desired compression ratio is applied to a sequence of sentences (e.g., a paragraph) as a whole, rather than to each individual sentence. To achieve that, we use B-BOC to assign an optimal compression ratio to each sentence, then cast the selection as a knapsack problem, which we solve using bounded dynamic programming. We evaluate B-BOC in both scenarios on three datasets, demonstrating that our optimizer improves both accuracy and Rouge-F1 score compared to direct application of other compression algorithms.
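The sequence-level scenario described in the abstract can be illustrated with a small dynamic program. The sketch below is not the authors' formulation; it assumes a multiple-choice knapsack in which each sentence has a few candidate versions, each with a token length and a predicted quality score (both hypothetical inputs here, standing in for whatever B-BOC would predict), and it picks one version per sentence so that the total length stays within a budget. The function name `select_compressions` and the `(length, score)` representation are illustrative assumptions.

```python
from typing import List, Tuple

# One candidate version of a sentence: (length in tokens, predicted quality).
# Both values are assumed inputs; the abstract does not specify this format.
Option = Tuple[int, float]


def select_compressions(
    options: List[List[Option]], budget: int
) -> Tuple[float, List[int]]:
    """Choose one option per sentence, maximizing total predicted quality
    while keeping the total length at or below `budget` tokens.

    Returns (best total score, chosen option index for each sentence).
    Raises ValueError if no combination fits within the budget.
    """
    neg_inf = float("-inf")
    # best[b]: best score over the sentences seen so far using exactly b tokens.
    best = [neg_inf] * (budget + 1)
    best[0] = 0.0
    picks: List[List[int]] = []  # picks[i][b]: option chosen for sentence i

    for sent_opts in options:
        new_best = [neg_inf] * (budget + 1)
        pick = [-1] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == neg_inf:
                continue  # this length is unreachable so far
            for k, (length, score) in enumerate(sent_opts):
                total = used + length
                if total <= budget and best[used] + score > new_best[total]:
                    new_best[total] = best[used] + score
                    pick[total] = k
        best = new_best
        picks.append(pick)

    end = max(range(budget + 1), key=lambda b: best[b])
    if best[end] == neg_inf:
        raise ValueError("no selection fits within the budget")

    # Walk backwards through the tables to recover the chosen options.
    chosen: List[int] = []
    used = end
    for i in range(len(options) - 1, -1, -1):
        k = picks[i][used]
        chosen.append(k)
        used -= options[i][k][0]
    chosen.reverse()
    return best[end], chosen


if __name__ == "__main__":
    # Three sentences; option 0 is the original, option 1 a compressed version.
    paragraph = [
        [(10, 1.0), (6, 0.7)],
        [(8, 1.0), (5, 0.4)],
        [(12, 1.0), (7, 0.9)],
    ]
    score, chosen = select_compressions(paragraph, budget=24)
    print(score, chosen)
```

In this toy input the uncompressed paragraph has 30 tokens, so at least some sentences must be shortened to meet a 24-token budget; the program leaves the second sentence intact because compressing it costs the most predicted quality. The running time is proportional to the number of sentences times the budget times the number of options per sentence.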

