Improving Fairness of Large Language Models in Multi-document Summarization
This addresses fairness issues in summarization systems that can mislead users, such as in product reviews, but is incremental as it builds on existing fairness measurement frameworks.
The paper tackles the problem of fairness in multi-document summarization by proposing FairPO, a preference tuning method that improves both summary-level and corpus-level fairness, outperforming strong baselines while maintaining summary quality.
Fairness in multi-document summarization (MDS) is crucial for providing comprehensive views across documents with diverse social attribute values, which can significantly impact decision-making. For example, a summarization system that tends to overrepresent negative reviews of products can mislead customers into disregarding good products. Previous works measure fairness in MDS at two levels: summary-level and corpus-level. While summary-level fairness focuses on individual summaries, corpus-level fairness focuses on a corpus of summaries. Recent methods primarily focus on summary-level fairness. We propose FairPO, a preference tuning method that focuses on both summary-level and corpus-level fairness in MDS. To improve summary-level fairness, we propose to generate preference pairs by perturbing document sets. To improve corpus-level fairness, we propose fairness-aware preference tuning by dynamically adjusting the weights of preference pairs. Our experiments show that FairPO outperforms strong baselines while maintaining the critical qualities of summaries. The code is available at https://github.com/leehaoyuan/coverage_fairnes.