CL · Jun 28, 2024

Mining Reasons For And Against Vaccination From Unstructured Data Using Nichesourcing and AI Data Augmentation

arXiv:2406.19951v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of extracting subjective vaccination-related reasons from unstructured data. The contribution is incremental: it applies existing AI and data augmentation techniques to a specific domain.

The researchers tackled the problem of predicting reasons for and against vaccination from unstructured text by creating the RFAV dataset, annotated through nichesourcing and augmented with GPT-4 and GPT-3.5-Turbo, and explored the impact of this artificially augmented data using in-context learning.

We present Reasons For and Against Vaccination (RFAV), a dataset for predicting reasons for and against vaccination, and scientific authorities used to justify them, annotated through nichesourcing and augmented using GPT4 and GPT3.5-Turbo. We show how it is possible to mine these reasons in non-structured text, under different task definitions, despite the high level of subjectivity involved, and explore the impact of artificially augmented data using in-context learning with GPT4 and GPT3.5-Turbo. We publish the dataset and the trained models along with the annotation manual used to train annotators and define the task.

