CY AI, Apr 10, 2024

Accuracy of a Large Language Model in Distinguishing Anti- And Pro-vaccination Messages on Social Media: The Case of Human Papillomavirus Vaccination

Soojong Kim, Kwanho Kim, Claire Wonjeong Jo

arXiv:2404.06731v15.918 citations, h-index: 3, Prev med rep

Originality: Synthesis-oriented

AI Analysis

This work helps researchers and health professionals analyze public health discourse on social media efficiently. Its contribution is incremental: it applies an existing LLM to a new dataset.

This research assessed ChatGPT's accuracy in distinguishing anti- and pro-vaccination messages about HPV vaccination on social media. It found high average accuracy (e.g., .882 for anti-vaccination long-form messages), but for long-form messages, accuracy was significantly lower for pro-vaccination messages than for anti-vaccination ones.

Objective. Vaccination has engendered a spectrum of public opinions, with social media acting as a crucial platform for health-related discussions. The emergence of artificial intelligence technologies, such as large language models (LLMs), offers a novel opportunity to efficiently investigate public discourses. This research assesses the accuracy of ChatGPT, a widely used and freely available service built upon an LLM, for sentiment analysis to discern different stances toward Human Papillomavirus (HPV) vaccination.

Methods. Messages related to HPV vaccination were collected from social media supporting different message formats: Facebook (long format) and Twitter (short format). A selection of 1,000 human-evaluated messages was input into the LLM, which generated multiple response instances containing its classification results. Accuracy was measured for each message as the level of concurrence between human and machine decisions, ranging between 0 and 1.

Results. Average accuracy was notably high when 20 response instances were used to determine the machine decision of each message: .882 (SE = .021) and .750 (SE = .029) for anti- and pro-vaccination long-form; .773 (SE = .027) and .723 (SE = .029) for anti- and pro-vaccination short-form, respectively. Using only three or even one instance did not lead to a severe decrease in accuracy. However, for long-form messages, the language model exhibited significantly lower accuracy in categorizing pro-vaccination messages than anti-vaccination ones.

Conclusions. ChatGPT shows potential in analyzing public opinions on HPV vaccination using social media content. However, understanding the characteristics and limitations of a language model within specific public health contexts remains imperative.
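The per-message accuracy measure described in the Methods can be illustrated with a short sketch. The abstract does not spell out the exact computation, so this assumes one plausible reading: accuracy for a message is the fraction of LLM response instances that match the human label, and the reported averages are means with standard errors across messages. The function names, the `"anti"`/`"pro"` label strings, and the toy data are all hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def message_accuracy(human_label, machine_labels):
    """Fraction of response instances that match the human label (0 to 1).

    Assumption: this is one possible reading of "level of concurrence";
    the paper's actual procedure may differ.
    """
    if not machine_labels:
        raise ValueError("need at least one response instance")
    matches = sum(1 for label in machine_labels if label == human_label)
    return matches / len(machine_labels)


def summarize(accuracies):
    """Return (mean, standard error) of per-message accuracies."""
    n = len(accuracies)
    avg = mean(accuracies)
    # Standard error = sample standard deviation / sqrt(n).
    se = stdev(accuracies) / sqrt(n) if n > 1 else 0.0
    return avg, se


# Hypothetical toy data: (human label, LLM labels from repeated instances).
messages = [
    ("anti", ["anti", "anti", "pro", "anti"]),
    ("pro", ["pro", "pro", "pro", "pro"]),
    ("anti", ["pro", "pro", "anti", "anti"]),
]

scores = [message_accuracy(human, machine) for human, machine in messages]
avg, se = summarize(scores)
print(f"per-message accuracy: {scores}")
print(f"mean = {avg:.3f}, SE = {se:.3f}")
```

Passing a single-element list of machine labels corresponds to the one-instance setting mentioned in the Results, where each message's accuracy is either 0 or 1.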
