AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis
This work addresses hallucination detection in natural language generation, contributing an incremental improvement to AI safety and evaluation in NLP.
The paper tackles the detection of fluent overgeneration hallucinations in text by fine-tuning and ensembling pre-trained models. The ensemble achieves accuracies of 77.8% and 79.9% on the model-agnostic and model-aware datasets, respectively, outperforming the baseline but falling short of the top competition results of 84.7% and 81.3%.
In this paper, we present our team's submissions for SemEval-2024 Task 6 (SHROOM), a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes. Participants were asked to perform binary classification to identify cases of fluent overgeneration hallucinations. Our experiments included fine-tuning a pre-trained model on hallucination detection and a Natural Language Inference (NLI) model. The most successful strategy was an ensemble of these models, which reached accuracies of 77.8% and 79.9% on the model-agnostic and model-aware datasets, respectively. This outperforms the organizers' baseline and compares favorably with the top-performing systems in the competition, which reported accuracies of 84.7% and 81.3%, respectively.
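The abstract does not state how the individual models are combined, so the following is only a minimal sketch of one common ensembling scheme, soft voting, under stated assumptions: each model outputs a hallucination probability per example, the probabilities are averaged (optionally weighted), and the result is thresholded at 0.5. The function names, the threshold, and the toy probabilities are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence


def soft_vote(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
) -> List[float]:
    """Weighted average of per-example hallucination probabilities.

    model_probs[m][i] is the probability that example i is a hallucination
    according to model m. Soft voting is an assumed combination rule here.
    """
    if not model_probs:
        raise ValueError("at least one model is required")
    n_examples = len(model_probs[0])
    if any(len(p) != n_examples for p in model_probs):
        raise ValueError("all models must score the same number of examples")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, model_probs)) / total
        for i in range(n_examples)
    ]


def predict_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Binary decision: 1 = hallucination, 0 = not a hallucination."""
    return [1 if p >= threshold else 0 for p in probs]


def accuracy(preds: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(preds) != len(gold):
        raise ValueError("preds and gold must have the same length")
    return sum(int(p == g) for p, g in zip(preds, gold)) / len(gold)


# Toy scores for four examples (illustrative values, not results from the paper).
hallucination_model_probs = [0.9, 0.2, 0.6, 0.4]  # hypothetical fine-tuned detector
nli_model_probs = [0.7, 0.4, 0.3, 0.8]            # hypothetical fine-tuned NLI model
gold_labels = [1, 0, 1, 1]

ensemble_probs = soft_vote([hallucination_model_probs, nli_model_probs])
ensemble_preds = predict_labels(ensemble_probs)
print(ensemble_preds, accuracy(ensemble_preds, gold_labels))
```

Accuracy, the metric quoted above, is the fraction of examples whose predicted label matches the gold label. Other combination rules, such as majority voting or a learned meta-classifier, would slot into the same interface by replacing `soft_vote`.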