Shimam Amer Chasib

2.7 | CL | Feb 12, 2025 | Code

Data Augmentation to Improve Large Language Models in Food Hazard and Product Detection

Areeg Fahad Rasheed, M. Zarkoosh, Shimam Amer Chasib et al.

This study examines the impact of data augmentation with ChatGPT-4o-mini on food hazard and product analysis. The augmented data is generated with ChatGPT-4o-mini and then used to train two large language models, RoBERTa-base and Flan-T5-base, which are evaluated on test sets. The results indicate that training with augmented data improved model performance on recall, F1 score, precision, and accuracy compared with training on the provided dataset alone. The full code, including model training and the augmented dataset, is available in this repository: https://github.com/AREEG94FAHAD/food-hazard-prdouct-cls
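As a rough illustration of the four metrics named in the abstract, the sketch below computes accuracy together with macro-averaged precision, recall, and F1 for a multi-class classifier. The abstract does not say which averaging scheme the authors used, so macro averaging is an assumption here, and the function name `classification_metrics` and the example labels are hypothetical rather than taken from the paper or its repository.

```python
from collections import defaultdict


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption; the abstract does not state
    how the reported scores were averaged across classes.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    tp = defaultdict(int)  # true positives per class
    fp = defaultdict(int)  # false positives per class
    fn = defaultdict(int)  # false negatives per class
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


if __name__ == "__main__":
    # Hypothetical hazard-category labels, for illustration only.
    gold = ["biological", "biological", "allergens", "chemical"]
    predicted = ["biological", "allergens", "allergens", "chemical"]
    print(classification_metrics(gold, predicted))
```

Comparing a model trained only on the provided dataset with one trained on the provided plus augmented data would then amount to calling this function once per model on the same test labels and comparing the resulting dictionaries.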
