CL | Aug 3, 2025

MOPrompt: Multi-objective Semantic Evolution for Prompt Optimization

Sara Câmara, Eduardo Luz, Valéria Carvalho, Ivan Meneghini, Gladston Moreira

arXiv:2508.01541v18.33 citations | h-index: 2 | STIL

Originality: Incremental advance

AI Analysis

This work addresses the efficiency-effectiveness trade-off in prompt optimization, which matters when deploying LLMs in real-world applications. It represents an incremental improvement over single-objective methods.

The paper tackles the challenge of optimizing prompts for Large Language Models by balancing task performance and context size, introducing MOPrompt, a multi-objective evolutionary framework that identifies prompts achieving the same peak accuracy as baselines with a 31% reduction in token length.

Prompt engineering is crucial for unlocking the potential of Large Language Models (LLMs). Because manual prompt design is often complex, non-intuitive, and time-consuming, automatic prompt optimization has emerged as a research area. A significant challenge in prompt optimization, however, is managing the inherent trade-off between task performance, such as accuracy, and context size. Most existing automated methods focus on a single objective, typically performance, and therefore fail to explore the critical spectrum between efficiency and effectiveness. This paper introduces MOPrompt, a novel Evolutionary Multi-objective Optimization (EMO) framework designed to optimize prompts for both accuracy and context size (measured in tokens) simultaneously. Our framework maps the Pareto front of prompt solutions, presenting practitioners with a set of trade-offs between context size and performance, a crucial tool for deploying LLMs in real-world applications. We evaluate MOPrompt on a sentiment analysis task in Portuguese, using Gemma-2B and Sabiazinho-3 as evaluation models. Our findings show that MOPrompt substantially outperforms the baseline framework. For the Sabiazinho model, MOPrompt identifies a prompt that achieves the same peak accuracy (0.97) as the best baseline solution, but with a 31% reduction in token length.
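To make the Pareto-front idea concrete, here is a minimal sketch of how non-dominated prompts could be selected when accuracy is maximized and token count is minimized. It is not the authors' implementation: the `Candidate` type, the `dominates` and `pareto_front` helpers, and every number in the example are hypothetical and chosen only for illustration. The evolutionary search itself (prompt generation and mutation) is not shown.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    """A scored prompt (hypothetical structure, not from the paper)."""
    prompt: str
    accuracy: float  # higher is better
    tokens: int      # lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is at least as good as `b` on both objectives
    and strictly better on at least one."""
    no_worse = a.accuracy >= b.accuracy and a.tokens <= b.tokens
    strictly_better = a.accuracy > b.accuracy or a.tokens < b.tokens
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates,
    sorted from shortest to longest prompt."""
    front = [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
    return sorted(front, key=lambda c: c.tokens)


# Made-up scores for illustration only; these are not results from the paper.
candidates = [
    Candidate("p1", accuracy=0.97, tokens=100),
    Candidate("p2", accuracy=0.97, tokens=69),   # same accuracy as p1, shorter
    Candidate("p3", accuracy=0.90, tokens=40),
    Candidate("p4", accuracy=0.85, tokens=55),   # worse than p3 on both objectives
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.prompt}: accuracy={c.accuracy:.2f}, tokens={c.tokens}")
```

In this toy set, `p1` is dropped because `p2` matches its accuracy with fewer tokens, and `p4` is dropped because `p3` is both shorter and more accurate. The remaining prompts are the trade-off set a practitioner would choose from.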
