On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters
For ML practitioners, this paper reframes PEFT from a budget-saving technique into a scalable approach for personalizing large models, though the work is largely conceptual with no concrete performance numbers.
The paper repositions PEFT as a mechanism for persistent personal models on top of shared foundation models. It organizes the problem around three scaling axes (Scale Up, Scale Down, Scale Out) and presents MinT as one infrastructure example for managing adapters. The results suggest that PEFT can serve as a compact substrate for instance-specific behavior rather than only a cheaper alternative to full fine-tuning.
Parameter-efficient fine-tuning (PEFT) is usually treated as a cheaper alternative to full fine-tuning. We study a broader role: small trainable adapters as persistent local state on top of strong shared foundation models. In this framing, the base model provides shared competence while adapters carry instance-specific behavior such as preferences, skills, tool habits, and memory-like updates. We organize the problem around three scaling axes: Scale Up, where stronger shared priors make small local updates more useful; Scale Down, where we study how small adapters can be while remaining reliable; and Scale Out, where many persistent adapted instances coexist. MinT provides one infrastructure example for managing adapter identity, revision, provenance, evaluation, and serving residency. Together, the results suggest that PEFT can be a compact substrate for persistent personal models rather than only a budget substitute for full fine-tuning.
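To make the framing concrete, the following is a minimal sketch of one frozen base weight shared by several small per-instance adapters. It assumes a LoRA-style low-rank update, which the abstract does not specify; the class names `SharedBase` and `LowRankAdapter`, the `forward` helper, the `user_*` keys, and all sizes are hypothetical choices for illustration and are not taken from the paper or from MinT.

```python
import numpy as np


class SharedBase:
    """Frozen weight matrix shared by every adapted instance."""

    def __init__(self, d_in, d_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)


class LowRankAdapter:
    """Per-instance state: a low-rank update B @ A (LoRA-style, an assumption)."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.standard_normal((rank, d_in)) * 0.01
        self.B = np.zeros((d_out, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def num_params(self):
        return self.A.size + self.B.size


def forward(base, adapter, x):
    """Shared competence (W x) plus instance-specific correction (B A x)."""
    return base.W @ x + adapter.scale * (adapter.B @ (adapter.A @ x))


d = 64
base = SharedBase(d, d)

# Scale Out: many persistent instances, each holding only its own adapter.
adapters = {f"user_{i}": LowRankAdapter(d, d, rank=2, seed=i) for i in range(3)}

x = np.ones(d)
y_before = forward(base, adapters["user_0"], x)

# Stand-in for training: only this instance's adapter changes, never base.W.
adapters["user_1"].B[:] = 0.1
y_user1 = forward(base, adapters["user_1"], x)

base_params = base.W.size
adapter_params = adapters["user_0"].num_params()
```

In this toy setting each adapter stores `2 * rank * d` values against `d * d` for the base, so the per-instance cost shrinks relative to the shared model as `d` grows or `rank` falls. That ratio is the quantity the Scale Down axis is concerned with, though the paper's own adapter sizes are not given in the abstract.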