Position: Vector Prompt Interfaces Should Be Exposed to Enable Customization of Large Language Models
This paper addresses a central bottleneck in the real-world deployment of large language models: customization that is scalable, stable, and inference-only.
This paper argues that large language model (LLM) providers should expose vector prompt inputs for customization, rather than relying solely on text prompts. The authors provide evidence that vector prompt tuning keeps improving with increasing supervision, while text-based prompt optimization saturates early, and that vector prompts show dense, global attention patterns.
As large language models (LLMs) transition from research prototypes to real-world systems, customization has emerged as a central bottleneck. While text prompts can already customize LLM behavior, we argue that text-only prompting is not a suitable control interface for scalable, stable, and inference-only customization. Our position is that model providers should expose \emph{vector prompt inputs} as part of the public interface for customizing LLMs. We support this position with diagnostic evidence showing that vector prompt tuning continues to improve with increasing supervision whereas text-based prompt optimization saturates early, and that vector prompts exhibit dense, global attention patterns indicative of a distinct control mechanism. We further discuss why inference-only customization is increasingly important under realistic deployment constraints, and why exposing vector prompts need not fundamentally increase model leakage risk under a standard black-box threat model. We conclude with a call to action for the community to rethink prompt interfaces as a core component of LLM customization.
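To make the notion of a vector prompt input concrete, the following is a minimal sketch, not the interface proposed in the paper. It assumes the common prompt-tuning setup in which a vector prompt is a short sequence of continuous vectors, each with the model's embedding dimension, prepended to the embedded text tokens before the frozen model runs. The names `embed_tokens` and `build_model_input`, the toy embedding table, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def embed_tokens(token_ids: List[int], embedding_table: List[Vector]) -> List[Vector]:
    """Look up one embedding vector per token id (toy stand-in for a model's embedding layer)."""
    return [list(embedding_table[t]) for t in token_ids]


def build_model_input(
    vector_prompt: List[Vector],
    token_ids: List[int],
    embedding_table: List[Vector],
) -> List[Vector]:
    """Prepend a vector prompt to the embedded text tokens.

    Assumption: the vector prompt lives in the same space as the token
    embeddings, so every prompt vector must have the embedding dimension.
    """
    d_model = len(embedding_table[0])
    for v in vector_prompt:
        if len(v) != d_model:
            raise ValueError("each prompt vector must have the model's embedding dimension")
    return [list(v) for v in vector_prompt] + embed_tokens(token_ids, embedding_table)


# Toy example: vocabulary of 4 tokens, embedding dimension 2 (illustrative values only).
embedding_table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
vector_prompt = [[0.5, -0.5], [0.25, 0.75]]  # two prompt vectors supplied by the user
token_ids = [2, 0]                           # the user's text, already tokenized

inputs = build_model_input(vector_prompt, token_ids, embedding_table)
print(len(inputs))  # two prompt vectors followed by two token embeddings
```

In this sketch the model weights and the embedding table are never modified; only the prepended vectors would be tuned, which is one way to read "inference-only" customization under the stated assumptions.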