Provable Guarantees for Gradient-Based Meta-Learning
This work provides a theoretical foundation for scalable meta-learning in deep learning, addressing the lack of provable guarantees available to practitioners.
The paper tackles the problem of meta-learning by bridging gradient-based methods with regularization-based transfer, achieving provable sample efficiency and generalization bounds that improve with task similarity, while matching a lower bound up to a constant factor.
We study the problem of meta-learning through the lens of online convex optimization, developing a meta-algorithm bridging the gap between popular gradient-based meta-learning and classical regularization-based multi-task transfer methods. Our method is the first to simultaneously satisfy good sample efficiency guarantees in the convex setting, with generalization bounds that improve with task similarity, while also being computationally scalable to modern deep learning architectures and the many-task setting. Despite its simplicity, the algorithm matches, up to a constant factor, a lower bound on the performance of any such parameter-transfer method under natural task similarity assumptions. We use experiments in both convex and deep learning settings to verify and demonstrate the applicability of our theory.
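To make the setting concrete, the following is a minimal sketch of gradient-based meta-learning in an online convex setting. It is not the paper's algorithm, which the abstract does not specify. It assumes a Reptile-style meta-update (the shared initialization moves toward each task's final iterate), squared-loss linear regression tasks, and synthetic data whose optimal parameters cluster around a common center. All function names, step sizes, and the data generator are hypothetical choices for illustration.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def run_task(init, examples, lr):
    """Within-task learner: online gradient descent on squared loss,
    started at the shared initialization. Returns the final iterate and
    the cumulative loss incurred before each update."""
    w = list(init)
    total_loss = 0.0
    for x, y in examples:
        err = dot(w, x) - y
        total_loss += 0.5 * err * err
        # Gradient of 0.5 * (w.x - y)^2 with respect to w is err * x.
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, total_loss

def meta_train(tasks, dim, lr, meta_lr):
    """Meta-learner (assumed Reptile-style update): after each task, move
    the initialization a fraction meta_lr toward that task's final iterate."""
    phi = [0.0] * dim
    losses = []
    for examples in tasks:
        w, loss = run_task(phi, examples, lr)
        losses.append(loss)
        phi = [p + meta_lr * (wi - p) for p, wi in zip(phi, w)]
    return phi, losses

def make_tasks(num_tasks, samples, center, spread, rng):
    """Synthetic similar tasks: each task's true weights are the common
    center plus Gaussian noise of standard deviation `spread`."""
    tasks = []
    for _ in range(num_tasks):
        w_star = [c + rng.gauss(0.0, spread) for c in center]
        examples = []
        for _ in range(samples):
            x = [rng.gauss(0.0, 1.0) for _ in center]
            examples.append((x, dot(w_star, x)))
        tasks.append(examples)
    return tasks

rng = random.Random(0)
center = [2.0, -1.0, 0.5]
tasks = make_tasks(num_tasks=200, samples=10, center=center, spread=0.1, rng=rng)
phi, losses = meta_train(tasks, dim=3, lr=0.1, meta_lr=0.2)

print("learned initialization:", [round(p, 2) for p in phi])
print("mean loss, first 20 tasks:", sum(losses[:20]) / 20)
print("mean loss, last 20 tasks:", sum(losses[-20:]) / 20)
```

In this toy setup the tasks are similar in the sense that their optimal parameters lie close together, so a good shared initialization leaves each within-task learner little distance to cover. Increasing `spread` makes the tasks less similar and should reduce the benefit of the learned initialization, which mirrors, informally, the abstract's claim that the guarantees improve with task similarity.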