Meta-learning of shared linear representations beyond well-specified linear regression
This work addresses the challenge of learning shared structure in multi-task and meta-learning for general convex objectives, an incremental advance over previous methods, which were limited to well-specified linear regression.
The paper tackles the problem of learning shared linear representations in multi-task and meta-learning beyond well-specified linear regression. It shows that rank and clustered regularized estimators recover such structure under mild assumptions, given enough samples per task and enough tasks, and it provides a polynomial-time algorithm based on nuclear norm constraints for this setting.
Motivated by multi-task and meta-learning approaches, we consider the problem of learning structure shared by tasks or users, such as shared low-rank representations or clustered structures. While all previous works focus on well-specified linear regression, we consider more general convex objectives, where the structural low-rank and cluster assumptions are expressed on the optima of each function. We show that under mild assumptions such as \textit{Hessian concentration} and \textit{noise concentration at the optimum}, rank and clustered regularized estimators recover such structure, provided the number of samples per task and the number of tasks are large enough. We then study the problem of recovering the subspace in which all the solutions lie, in the setting where there is only a single sample per task: we show that in that case the rank-constrained estimator can recover the subspace, but that the number of tasks needs to scale exponentially with the dimension of the subspace. Finally, we provide a polynomial-time algorithm via nuclear norm constraints for learning a shared linear representation in the context of convex learning objectives.
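To make the setting concrete, the two estimators mentioned above can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $T$ tasks, $n$ samples $z_{t,i}$ per task, a convex loss $\ell$, a parameter matrix $W = [w_1, \dots, w_T] \in \mathbb{R}^{d \times T}$, and a hypothetical constraint radius $\tau > 0$. The low-rank assumption is placed on the matrix of per-task population optima, $W^\star = [w_1^\star, \dots, w_T^\star]$ with $\operatorname{rank}(W^\star) \le r$.

\[
\begin{aligned}
w_t^\star &\in \arg\min_{w \in \mathbb{R}^d} \; \mathbb{E}\big[\ell(w; z_t)\big], \qquad t = 1, \dots, T, \\
\widehat{W}_{\mathrm{rank}} &\in \arg\min_{\operatorname{rank}(W) \le r} \; \frac{1}{nT} \sum_{t=1}^{T} \sum_{i=1}^{n} \ell(w_t; z_{t,i}), \\
\widehat{W}_{\mathrm{nuc}} &\in \arg\min_{\|W\|_* \le \tau} \; \frac{1}{nT} \sum_{t=1}^{T} \sum_{i=1}^{n} \ell(w_t; z_{t,i}).
\end{aligned}
\]

Under these assumptions, the rank constraint defines a non-convex feasible set, whereas the nuclear norm ball $\{W : \|W\|_* \le \tau\}$ is convex, so the last problem is a convex program whenever $\ell$ is convex in $w$. This is one plausible reading of why a nuclear norm constraint leads to a polynomial-time procedure; the exact estimator and the choice of $\tau$ analyzed in the paper may differ.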