CrossPT: Exploring Cross-Task Transferability through Multi-Task Prompt Tuning
This work addresses NLP practitioners' need for efficient knowledge transfer in multi-task settings, though the contribution is incremental in that it builds on existing prompt tuning methods.
The paper addresses a limitation of single-task prompt tuning: it cannot share knowledge across related tasks. It proposes CrossPT, a modular framework for multi-task prompt tuning that enables controlled knowledge transfer. Results on GLUE and related benchmarks show that CrossPT achieves higher accuracy and robustness than traditional prompt tuning, especially in low-resource scenarios, while maintaining parameter efficiency.
Prompt tuning offers a parameter-efficient way to adapt large pre-trained language models to new tasks, but most existing approaches are designed for single-task settings and therefore do not share knowledge across related tasks. We propose Cross-task Prompt Tuning (CrossPT), a modular framework for multi-task prompt tuning that enables controlled knowledge transfer while maintaining task-specific specialization. CrossPT decomposes each target prompt into shared, pre-trained source prompts and task-specific private prompts, combined via a learned attention mechanism. To support robust transfer, we systematically investigate key design factors, including prompt initialization, the balance between shared and private prompts, the number of source prompts, learning rates, task prefixes, and label semantics. Empirical results on GLUE and related benchmarks show that CrossPT achieves higher accuracy and robustness than traditional prompt tuning and related methods, particularly in low-resource scenarios, while maintaining strong parameter efficiency.
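The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The abstract does not specify how the prompts are combined, so the sketch assumes that the attention weights are a softmax over per-task logits and that the private prompt is added to the weighted sum of source prompts. The names `compose_prompt` and `attn_logits` are hypothetical, and the logits are fixed numbers here rather than learned parameters.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def compose_prompt(source_prompts, private_prompt, attn_logits):
    """Build a target prompt from shared source prompts and a private prompt.

    source_prompts: K prompts, each a list of L token vectors of dimension D.
    private_prompt: one task-specific prompt of shape L x D.
    attn_logits:    K scores for this task (assumed learned in a real model).

    Assumption: target = private + sum_k softmax(attn_logits)[k] * source_k.
    """
    weights = softmax(attn_logits)
    length, dim = len(private_prompt), len(private_prompt[0])
    # Start from a copy of the private prompt so the input is not mutated.
    target = [[private_prompt[t][d] for d in range(dim)] for t in range(length)]
    for w, src in zip(weights, source_prompts):
        for t in range(length):
            for d in range(dim):
                target[t][d] += w * src[t][d]
    return target


# Toy example: two source prompts, prompt length 2, embedding size 2.
sources = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
private = [[0.5, 0.5], [0.0, 0.0]]
target = compose_prompt(sources, private, attn_logits=[0.0, 0.0])
```

With equal logits, each source prompt contributes half of its values, and the private prompt shifts only the first token position. In a trained model the logits would differ per task, which is one way the degree of cross-task transfer could be controlled.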