Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control
This work addresses embedding collapse in prompt-tuning and offers researchers insights into interpretability and control. It is incremental: it builds on existing prompt-tuning methods without introducing a new paradigm.
The study investigates how embedding priors influence prompt-tuning performance. It finds that priors strongly affect where the tuned embeddings end up, and that models can use embeddings from different regions of the activation space, including entirely new ones. Activations form distinct clusters for distant tasks but fall into similar clusters for related NLP tasks.
Prompt-Tuning is an efficient method for adapting pre-trained language models to new tasks with minimal computational overhead by modifying prompt embeddings. In this work, we investigate how crucial embedding collapse, a phenomenon frequently observed in Prompt-Tuning, is for the final performance of the model. To address this question, we design embedding priors and compare them with the posteriors of converged Soft and Deep Prompt-Tuning methods. Our findings suggest that priors strongly affect the position of the tuned embeddings, and that models can work effectively with embeddings from different parts of the activation space, including completely new regions. Because the final capabilities of Prompt-Tuning are limited, we hypothesize that controllable Prompt-Tuning posteriors may serve as a good starting point for tasks such as chain-of-thought (CoT) distillation. Our experiments also show that generated trajectories are not localized in the activation space of the models. However, there are distinct clusters of activations for distant tasks (e.g., NLP and arithmetic), while activations for different NLP tasks (e.g., Question-Answering and MLM) lie in the same cluster. These observations raise questions about the importance of a single activation cluster for the generalization abilities of large language models.
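To make the setup concrete, the sketch below illustrates the general idea of drawing soft-prompt embeddings from a prior and measuring how far a tuned (posterior) prompt has moved from it. It is only an illustration under stated assumptions, not the procedure used in this work: the isotropic Gaussian prior, the centroid-distance measure, and all function names (`sample_prompt_prior`, `prepend_prompt`, `centroid_distance`) are hypothetical choices made here, and the "posterior" is a hand-written stand-in rather than the result of actual training.

```python
import math
import random


def sample_prompt_prior(num_tokens, dim, center, scale, seed=0):
    """Draw initial soft-prompt vectors from an isotropic Gaussian prior.

    `center` selects the region of embedding space the prompt starts in;
    `scale` is the per-dimension standard deviation. Both are assumptions
    made for illustration.
    """
    rng = random.Random(seed)
    return [[rng.gauss(center[d], scale) for d in range(dim)]
            for _ in range(num_tokens)]


def prepend_prompt(prompt, token_embeddings):
    """Soft prompt-tuning input: prompt vectors followed by the (frozen)
    token embeddings of the actual input."""
    return prompt + token_embeddings


def centroid(vectors):
    """Mean vector of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def centroid_distance(a, b):
    """Euclidean distance between the centroids of two sets of vectors,
    used here as a crude measure of prior-to-posterior drift."""
    ca, cb = centroid(a), centroid(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


# Prior centred at the origin; a stand-in "posterior" centred elsewhere.
prior = sample_prompt_prior(num_tokens=4, dim=3, center=[0.0, 0.0, 0.0], scale=0.1)
posterior = sample_prompt_prior(num_tokens=4, dim=3, center=[3.0, 4.0, 0.0], scale=0.1, seed=1)

tokens = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # toy frozen input embeddings
model_input = prepend_prompt(prior, tokens)

print(len(model_input), "vectors fed to the model")
print("prior-to-posterior drift:", centroid_distance(prior, posterior))
```

In a real experiment the posterior would come from gradient-based tuning of the prompt vectors against a frozen model, and the comparison would likely use a richer measure than a single centroid distance.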