CL, AI | Apr 9, 2024

Pitfalls of Conversational LLMs on News Debiasing

Ipek Baris Schlicht, Defne Altiok, Maryanne Taouk, Lucie Flek

arXiv:2404.06488v1 | 24.384 citations | h-index: 11 | DELITE

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automated news debiasing for editors and journalists. It is incremental, however, because it evaluates existing models rather than proposing new solutions.

The paper examines how well conversational LLMs debias news text during editing. None of the models was perfect: some, including ChatGPT, introduced unnecessary changes that could alter the author's style and create misinformation. The models also performed worse than domain experts when evaluating debiased outputs.

This paper addresses debiasing in news editing and evaluates the effectiveness of conversational Large Language Models in this task. We designed an evaluation checklist tailored to news editors' perspectives, obtained generated texts from three popular conversational models using a subset of a publicly available dataset on media bias, and evaluated the texts according to the designed checklist. Furthermore, we examined the models as evaluators for checking the quality of debiased model outputs. Our findings indicate that none of the LLMs are perfect in debiasing. Notably, some models, including ChatGPT, introduced unnecessary changes that may impact the author's style and create misinformation. Lastly, we show that the models do not perform as proficiently as domain experts in evaluating the quality of debiased outputs.
