Clickbait Spoiling via Question Answering and Passage Retrieval
This work addresses the problem of misleading online content by providing users with automated spoilers, though the contribution is incremental, since it builds on existing question answering and retrieval methods.
The paper tackles clickbait spoiling, that is, generating a short text that satisfies the curiosity a clickbait post induces. Its spoiler type classifier achieves an accuracy of 80%, and the question answering model DeBERTa-large outperforms all other models in generating spoilers.
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts -- the Webis Clickbait Spoiling Corpus 2022 -- shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
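
To make the passage-retrieval view of spoiling concrete, here is a minimal sketch in Python. It is not the approach evaluated in the paper (which relies on trained models such as DeBERTa-large); it is a toy lexical-overlap baseline, and the names `STOPWORDS`, `tokenize`, and `retrieve_passage_spoiler`, as well as the example post and passages, are assumptions made purely for illustration.

```python
import re

# Hypothetical, hand-picked stopword list (an assumption for this sketch only).
STOPWORDS = {
    "a", "an", "the", "this", "that", "is", "are", "of", "to", "in",
    "you", "your", "will", "what", "why", "how", "and", "for", "on", "it",
}


def tokenize(text):
    """Lowercase the text and return its non-stopword alphanumeric tokens as a set."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS}


def retrieve_passage_spoiler(post, passages):
    """Return the passage that shares the most terms with the clickbait post.

    Toy baseline, not the paper's method. Ties go to the earliest passage.
    `passages` must be a non-empty list of strings.
    """
    query = tokenize(post)
    scores = [len(query & tokenize(p)) for p in passages]
    best = max(range(len(passages)), key=lambda i: scores[i])
    return passages[best]


if __name__ == "__main__":
    post = "This one trick keeps avocados fresh for days"
    passages = [
        "Avocados are a popular fruit around the world.",
        "Many people throw away browned halves every week.",
        "The trick that keeps avocados fresh is to store the cut half with a slice of onion.",
    ]
    print(retrieve_passage_spoiler(post, passages))
```

A phrase spoiler would instead call for extracting a short span from the linked page, which is where an extractive question answering model would come in; the sketch above covers only the passage case.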