Task-Feature Collaborative Learning with Application to Personalized Attribute Prediction
This work addresses negative transfer in multi-task learning, a key obstacle to improving performance when training samples are insufficient. The contribution appears incremental, however, as it builds on existing MTL methods by adding feature-task collaboration.
The paper tackles negative transfer in multi-task learning by proposing Task-Feature Collaborative Learning (TFCL), which models collaborative grouping of features and tasks to suppress inter-group sharing, and demonstrates effectiveness on simulated and real-world datasets for personalized attribute prediction.
As an effective learning paradigm against insufficient training samples, Multi-Task Learning (MTL) encourages knowledge sharing across multiple related tasks so as to improve overall performance. A major challenge in MTL is negative transfer: sharing knowledge with dissimilar and hard tasks often worsens performance. Though a substantial number of studies have been carried out to address negative transfer, most existing methods model the transfer relationship only as task correlations, leaving the transfer across features and tasks unconsidered. Unlike existing methods, our goal is to alleviate negative transfer collaboratively across features and tasks. To this end, we propose a novel multi-task learning method called Task-Feature Collaborative Learning (TFCL). Specifically, we first propose a base model with a heterogeneous block-diagonal structure regularizer to leverage the collaborative grouping of features and tasks and suppress inter-group knowledge sharing. We then propose an optimization method for the model. Extensive theoretical analysis shows that our proposed method has the following benefits: (a) it enjoys the global convergence property and (b) it provides a block-diagonal structure recovery guarantee. As a practical extension, we extend the base model by allowing overlapping features and differentiating the hard tasks. We further apply it to the personalized attribute prediction problem with fine-grained modeling of user behaviors. Finally, experimental results on both simulated and real-world datasets demonstrate the effectiveness of our proposed method.
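To make the idea of a block-diagonal feature-task structure concrete, the sketch below shows one standard spectral way to measure it. It is an illustration under stated assumptions, not necessarily the regularizer used in TFCL, whose exact form the abstract does not give. The weight matrix `W` (features by tasks) is treated as a bipartite graph, and the number of near-zero eigenvalues of its graph Laplacian equals the number of disconnected feature-task groups. The helper names `bipartite_laplacian` and `k_smallest_eig_sum` and the toy matrices are hypothetical.

```python
import numpy as np

def bipartite_laplacian(W):
    """Laplacian of the bipartite feature-task graph with edge weights |W|."""
    d, T = W.shape
    A = np.zeros((d + T, d + T))
    A[:d, d:] = np.abs(W)      # feature -> task edges
    A[d:, :d] = np.abs(W).T    # task -> feature edges (symmetric)
    return np.diag(A.sum(axis=1)) - A

def k_smallest_eig_sum(W, k):
    """Sum of the k smallest Laplacian eigenvalues.

    This is (numerically) zero when the graph has at least k connected
    components, i.e. when W can be permuted into k diagonal blocks.
    """
    eigvals = np.linalg.eigvalsh(bipartite_laplacian(W))  # ascending order
    return eigvals[:k].sum()

# Toy example (assumed data): 5 features, 4 tasks, two clean groups.
W_block = np.zeros((5, 4))
W_block[:3, :2] = 1.0   # group 1: features 0-2 with tasks 0-1
W_block[3:, 2:] = 1.0   # group 2: features 3-4 with tasks 2-3

# Same matrix with one inter-group entry (feature 0 leaks into task 3).
W_leaky = W_block.copy()
W_leaky[0, 3] = 0.5

penalty_block = k_smallest_eig_sum(W_block, k=2)
penalty_leaky = k_smallest_eig_sum(W_leaky, k=2)
print(f"block-diagonal W: {penalty_block:.6f}")
print(f"W with inter-group sharing: {penalty_leaky:.6f}")
```

Under these assumptions, the quantity vanishes for the cleanly grouped matrix and becomes strictly positive once a single cross-group weight connects the two groups. A penalty of this kind would therefore discourage inter-group knowledge sharing.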