SemEval-2025 Task 9: The Food Hazard Detection Challenge
This work addresses food safety monitoring by providing a dataset and methods for detecting food hazards in web texts. The contribution is incremental in that it builds on existing NLP techniques.
The paper tackles text-based food hazard detection with long-tail class distributions by introducing a challenge with two subtasks. It finds that large language model-generated synthetic data can be highly effective for oversampling long-tail classes, and that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks.
In this challenge, we explored text-based food hazard prediction with long-tail class distributions. The task was divided into two subtasks: (1) predicting whether a web text implies one of ten food-hazard categories and identifying the associated food category, and (2) providing a more fine-grained classification by assigning a specific label to both the hazard and the product. Our findings highlight that large language model-generated synthetic data can be highly effective for oversampling long-tail distributions. Furthermore, we find that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks. During this challenge, we gradually released (under CC BY-NC-SA 4.0) a novel set of 6,644 manually labeled food-incident reports.
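As a rough illustration of what oversampling a long-tail distribution involves, the sketch below computes how many extra examples each rare class would need to reach a minimum count. It is not the method of any participating system: the function name `oversampling_plan`, the `min_count` threshold, and the toy labels are assumptions made for this example, and the step that would actually generate synthetic texts with a large language model is left out.

```python
from collections import Counter


def oversampling_plan(labels, min_count):
    """Return, per class, how many extra examples are needed to reach min_count.

    Classes that already have at least min_count examples are omitted.
    """
    counts = Counter(labels)
    return {label: min_count - n for label, n in counts.items() if n < min_count}


# Toy labels: class names and counts are illustrative, not taken from the dataset.
labels = ["biological"] * 50 + ["allergens"] * 30 + ["migration"] * 2

plan = oversampling_plan(labels, min_count=10)
print(plan)  # {'migration': 8}
```

Each entry in the resulting plan is the number of synthetic reports that would have to be produced for that class, by whatever generation procedure is used, before the class reaches the chosen floor.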