PHORECAST: Enabling AI Understanding of Public Health Outreach Across Populations
This work provides a new dataset for advancing socially aware AI in public health, addressing a lack of comprehensive data in this high-stakes domain.
The authors address the limited ability of AI systems to understand nuanced human responses to public health messaging by introducing PHORECAST, a multimodal dataset that enables fine-grained prediction of individual and community engagement patterns and supports tasks such as multimodal understanding and social forecasting.
Understanding how diverse individuals and communities respond to persuasive messaging holds significant potential for advancing personalized and socially aware machine learning. While Large Vision and Language Models (VLMs) offer promise, their ability to emulate nuanced, heterogeneous human responses, particularly in high-stakes domains like public health, remains underexplored, due in part to the lack of comprehensive multimodal datasets. We introduce PHORECAST (Public Health Outreach REceptivity and CAmpaign Signal Tracking), a multimodal dataset curated to enable fine-grained prediction of both individual-level behavioral responses and community-wide engagement patterns to health messaging. This dataset supports tasks in multimodal understanding, response prediction, personalization, and social forecasting, allowing rigorous evaluation of how well modern AI systems can emulate, interpret, and anticipate heterogeneous public sentiment and behavior. By providing a new dataset to enable AI advances for public health, PHORECAST aims to catalyze the development of models that are not only more socially aware but also aligned with the goals of adaptive and inclusive health communication.