CL LG | Mar 18, 2024

CICLe: Conformal In-Context Learning for Largescale Multi-Class Food Risk Classification

Korbinian Randl, John Pavlopoulos, Aron Henriksson, Tony Lindgren

arXiv:2403.11904v315.229 citations | h-index: 5 | Has Code | ACL

Originality Synthesis-oriented

AI Analysis

This work addresses food safety monitoring for public health, but it is incremental as it builds on existing methods like Conformal Prediction and LLM prompting.

The authors tackle the automatic detection of food risks from recall announcements. They publish a dataset of 7,546 labeled texts and benchmark a range of models, finding that Logistic Regression with tf-idf outperforms Transformer models on low-support classes, and that a Conformal Prediction-based LLM-in-the-loop framework improves on the base classifier while consuming less energy than normal prompting.

Contaminated or adulterated food poses a substantial risk to human health. Given sets of labeled web texts for training, Machine Learning and Natural Language Processing can be applied to automatically detect such risks. We publish a dataset of 7,546 short texts describing public food recall announcements. Each text is manually labeled, on two granularity levels (coarse and fine), for food products and hazards that the recall corresponds to. We describe the dataset and benchmark naive, traditional, and Transformer models. Based on our analysis, Logistic Regression based on a tf-idf representation outperforms RoBERTa and XLM-R on classes with low support. Finally, we discuss different prompting strategies and present an LLM-in-the-loop framework, based on Conformal Prediction, which boosts the performance of the base classifier while reducing energy consumption compared to normal prompting.
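
The abstract does not spell out how the Conformal Prediction step and the LLM interact, so the following is only a minimal sketch of one plausible reading: split conformal prediction turns the base classifier's class probabilities into a small candidate set, and the LLM is consulted only when more than one candidate remains. The function names, the nonconformity score (one minus the probability of the true class), and the `ask_llm` callback are all assumptions made for illustration, not the authors' implementation.

```python
import math

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Calibrate a score threshold on held-out data (split conformal prediction).

    Assumed nonconformity score: 1 - probability assigned to the true class.
    """
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return 1.0  # too few calibration points: keep every class
    return scores[rank - 1]

def prediction_set(probs, threshold):
    """Classes whose score is within the threshold, most probable first."""
    kept = [c for c, p in enumerate(probs) if 1.0 - p <= threshold]
    return sorted(kept, key=lambda c: -probs[c])

def classify(probs, threshold, ask_llm):
    """Return a class index, calling the (hypothetical) LLM only when needed."""
    candidates = prediction_set(probs, threshold)
    if not candidates:
        # Empty set: fall back to the base classifier's top class.
        return max(range(len(probs)), key=lambda c: probs[c])
    if len(candidates) == 1:
        return candidates[0]  # base classifier is confident; no LLM call
    return ask_llm(candidates)  # LLM chooses among the reduced candidate set

# Toy usage: probabilities would come from the base classifier
# (e.g. Logistic Regression on tf-idf features).
cal_probs = [[0.9, 0.1, 0.0], [0.2, 0.7, 0.1], [0.1, 0.1, 0.8], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 0]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
label = classify([0.7, 0.25, 0.05], threshold, ask_llm=lambda cands: cands[0])
```

Under this reading, the energy saving would come from two places: texts with a single-class prediction set never reach the LLM, and the remaining prompts only need to list the few candidate classes rather than the full label space.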

