CL · AI · Apr 26, 2025

A Simple Ensemble Strategy for LLM Inference: Towards More Stable Text Classification

arXiv:2504.18884v2 · 10 citations · h-index: 4 · NLDB

Originality: Synthesis-oriented

AI Analysis

This work addresses reproducibility issues in LLM-based text classification, but it is incremental: it applies a known ensemble technique to a new context.

The study tackled the problem of variability and reproducibility in LLM inference for sentiment analysis by introducing a simple ensemble strategy, resulting in an 18.6% reduction in RMSE compared to using a large model with a single attempt.

With the advance of large language models (LLMs), LLMs have been utilized for various tasks. However, the variability and reproducibility of results across individual LLM trials have been largely overlooked in the existing literature, whereas actual human annotation uses majority voting to resolve disagreements among annotators. Therefore, this study introduces a straightforward ensemble strategy to sentiment analysis using LLMs. As a result, we demonstrate that an ensemble of multiple inferences using medium-sized LLMs produces more robust and accurate results than a large model with a single attempt, reducing RMSE by 18.6%.
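
The abstract does not spell out how the individual inferences are combined, so the following is only a minimal sketch of one plausible reading: per-item majority voting over repeated runs, followed by an RMSE comparison against gold labels. The function names `majority_vote` and `rmse`, the tie-breaking rule, the numeric label encoding (-1, 0, 1), and the toy values are all assumptions made for illustration; they are not taken from the paper and are not its results.

```python
import math
from collections import Counter
from statistics import mean


def majority_vote(labels):
    """Return the most frequent label; ties go to the smallest label (an assumed rule)."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences of numeric labels."""
    return math.sqrt(mean((p - t) ** 2 for p, t in zip(predictions, targets)))


# Toy, hand-made sentiment scores (-1 negative, 0 neutral, 1 positive).
# Each inner list is one hypothetical inference run over the same four texts.
runs = [
    [0, 0, -1, 1],
    [1, 1, -1, 0],
    [1, 0, 0, 1],
]
gold = [1, 0, -1, 1]

# Combine the runs item by item, as annotators' labels would be combined.
ensemble = [majority_vote(votes) for votes in zip(*runs)]

for i, run in enumerate(runs, start=1):
    print(f"run {i} RMSE: {rmse(run, gold):.3f}")
print(f"ensemble RMSE: {rmse(ensemble, gold):.3f}")
```

In this toy setup each single run disagrees with the gold labels on at least one item, while the per-item vote cancels those isolated errors. Whether that holds for real data depends on how correlated the errors across runs are.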
