CL · Feb 8, 2022

Differentiable N-gram Objective on Abstractive Summarization

Yunqi Zhu, Xuebing Yang, Yuanyuan Wu, Mingjin Zhu, Wensheng Zhang

arXiv:2202.04003v6 · 0.3 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a methodological gap for researchers working on sequence-to-sequence tasks, though it appears incremental because it builds on existing n-gram optimization approaches.

The paper tackles the discrepancy between training with cross-entropy loss and evaluating with ROUGE metrics in abstractive summarization by introducing differentiable n-gram objectives, resulting in decent ROUGE score enhancements on CNN/DM and XSum datasets while outperforming alternative n-gram objectives.

ROUGE is a standard n-gram-based automatic evaluation metric for sequence-to-sequence tasks, while cross-entropy loss, an essential objective of neural network language models, optimizes at the unigram level. We present differentiable n-gram objectives that attempt to alleviate the discrepancy between the training criterion and the evaluation criterion. The objective maximizes the probabilistic weight of matched sub-sequences. The novelty of our work is that the objective weights the matched sub-sequences equally and does not cap the number of matched sub-sequences at the ground-truth count of n-grams in the reference sequence. We jointly optimize the cross-entropy loss and the proposed objective, obtaining decent ROUGE score improvements on the abstractive summarization datasets CNN/DM and XSum and outperforming alternative n-gram objectives.
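
To make the idea of a probabilistic n-gram match concrete, here is a minimal sketch of one possible reading of the abstract, not the authors' implementation. It assumes the model's output is available as a per-position probability table, scores each candidate match as the product of the probabilities of the reference tokens, sums all matches with equal weight, and applies no clipping by reference n-gram counts. The names `soft_ngram_match`, `cross_entropy`, `joint_loss`, the mixing weight `alpha`, and the normalization by the number of n-gram positions are all hypothetical choices made for illustration.

```python
import math


def soft_ngram_match(probs, reference, n):
    """Soft count of n-gram matches between predicted distributions and a reference.

    probs[t][v] is the (assumed) model probability of token id v at output position t.
    reference is a list of token ids.
    """
    total = 0.0
    for i in range(len(probs) - n + 1):          # n-gram start in the prediction
        for j in range(len(reference) - n + 1):  # n-gram start in the reference
            weight = 1.0
            for k in range(n):
                weight *= probs[i + k][reference[j + k]]
            # Every match contributes equally; nothing is clipped by
            # how often the n-gram occurs in the reference.
            total += weight
    return total


def cross_entropy(probs, reference):
    """Mean token-level negative log-likelihood (teacher-forced alignment assumed)."""
    return -sum(math.log(probs[t][tok]) for t, tok in enumerate(reference)) / len(reference)


def joint_loss(probs, reference, n=2, alpha=1.0):
    """Cross-entropy minus a weighted, position-normalized soft n-gram score (assumption)."""
    positions = max(len(probs) - n + 1, 1)
    reward = soft_ngram_match(probs, reference, n) / positions
    return cross_entropy(probs, reference) - alpha * reward


reference = [0, 1, 2]
confident = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
uniform = [[1 / 3] * 3 for _ in range(3)]

print(soft_ngram_match(confident, reference, 2))
print(soft_ngram_match(uniform, reference, 2))
print(joint_loss(confident, reference), joint_loss(uniform, reference))
```

Because the score is a sum of products of probabilities, it is differentiable with respect to the model outputs, so in an autodiff framework the same computation could be added to the cross-entropy term and trained end to end.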
