'Don't Get Too Technical with Me': A Discourse Structure-Based Framework for Science Journalism
This work addresses the challenge of making scientific findings accessible to the general public, though the contribution is incremental because it builds on existing text generation methods.
The paper tackles the problem of automatically generating science journalism articles from technical papers by introducing a new dataset (SciTechNews) and a framework that integrates discourse structure and metadata. Automatic and human evaluations indicate that the framework outperforms baselines such as Alpaca and ChatGPT in content planning, simplification, and coherence.
Science journalism refers to the task of reporting the technical findings of a scientific paper as a less technical news article for the general public. We aim to design an automated system to support this real-world task (i.e., automatic science journalism) by 1) introducing a newly constructed real-world dataset (SciTechNews), with tuples of a publicly available scientific paper, its corresponding news article, and an expert-written short summary snippet; 2) proposing a novel technical framework that integrates a paper's discourse structure with its metadata to guide generation; and 3) demonstrating with extensive automatic and human experiments that our framework outperforms other baseline methods (e.g., Alpaca and ChatGPT) in elaborating a content plan meaningful for the target audience, simplifying the information selected, and producing a coherent final report in a layman's style.
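The dataset tuples described above can be pictured as a simple record type. The following is a minimal sketch only: the class name, field names, and the `is_complete` helper are hypothetical and are not taken from the SciTechNews release, whose actual schema may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SciTechNewsExample:
    """One illustrative dataset tuple (all field names are assumptions)."""
    paper_text: str       # the publicly available scientific paper
    news_article: str     # the corresponding news article
    summary_snippet: str  # the expert-written short summary


def is_complete(example: SciTechNewsExample) -> bool:
    """Return True when all three parts of the tuple are non-empty."""
    parts = (example.paper_text, example.news_article, example.summary_snippet)
    return all(part.strip() for part in parts)
```

A record with a missing part, such as a paper without its news article, would fail `is_complete` and could be filtered out before training or evaluation.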