CL · LG · Jun 3

CRAFT: Cost-aware Refinement And Front-aware Tuning of Prompts

arXiv:2606.04661 · 31.9
Predicted impact: top 31% in CL · last 90 days
Originality: Incremental advance
AI Analysis

For practitioners using LLMs, CRAFT enables post-search selection of accuracy-cost trade-offs, eliminating the need to pre-commit to a fixed weight.

CRAFT is a Pareto-front prompt optimizer that finds prompts balancing accuracy and token cost, avoiding the scalarization collapse of weighted-sum methods. Across six benchmarks, its retained fronts reach both high-accuracy and low-cost regions, while baselines concentrate in narrower regions.

Prompts tuned for accuracy often grow long, raising inference cost on every model call. The best accuracy-cost trade-off depends on the task and the budget, so prompt optimization is a search over the Pareto front of accuracy and prompt-token cost rather than a search for a single prompt. The usual shortcut, collapsing the objectives into a weighted sum, fixes the trade-off weight before search and often recovers only a narrow region of the front, a failure we call scalarization collapse. We present CRAFT (Cost-aware Refinement And Front-aware Tuning), a Pareto-front prompt optimizer that treats target-LLM validation calls as the scarce resource and allocates them to candidates near the optimistic candidate front. Each round, complementary accuracy-oriented and cost-oriented generators propose edits, Pareto-gap acquisition spends the per-round validation budget, and NSGA-II retention keeps a spread-out population. Across six classification and reasoning benchmarks, CRAFT's retained fronts reach both high-accuracy and low-cost regions, while accuracy-only, cost-only, and weighted-sum baselines each concentrate in narrower regions. The accuracy-cost trade-off becomes a post-search choice, not a pre-search weight.
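To make the contrast between a Pareto front and a weighted sum concrete, here is a minimal Python sketch. It is not CRAFT's implementation: the candidate names, accuracies, token counts, the helpers `dominates`, `pareto_front`, and `weighted_sum_pick`, and the cost normalization are all hypothetical, chosen only to illustrate how a weighted sum can skip a non-dominated prompt.

```python
from typing import List, Set, Tuple

# (name, validation accuracy, prompt tokens); all values are made up for illustration.
Candidate = Tuple[str, float, int]

candidates: List[Candidate] = [
    ("terse", 0.70, 40),
    ("medium", 0.74, 200),
    ("verbose", 0.90, 400),
    ("bloated", 0.85, 450),  # less accurate and longer than "verbose"
]


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] <= b[2]
    strictly_better = a[1] > b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


def weighted_sum_pick(cands: List[Candidate], w: float) -> Candidate:
    """Pick the single best candidate under an assumed scalarization:
    score = w * accuracy - (1 - w) * normalized token cost."""
    max_tokens = max(c[2] for c in cands)
    return max(cands, key=lambda c: w * c[1] - (1 - w) * c[2] / max_tokens)


front_names = [c[0] for c in pareto_front(candidates)]

# Sweep the trade-off weight from 0 to 1 and record which prompts are ever chosen.
picked: Set[str] = {weighted_sum_pick(candidates, i / 20)[0] for i in range(21)}

print("Pareto front:", front_names)
print("Weighted-sum picks over all weights:", sorted(picked))
```

In this toy data the front holds three prompts, but the weight sweep only ever selects the two extremes: "medium" is non-dominated yet lies in a non-convex part of the front, so no weight makes it the maximizer. Keeping the whole front is what lets the trade-off be chosen after the search.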

