LG, IT, ML · Oct 21, 2020

Conditional Mutual Information-Based Generalization Bound for Meta Learning

arXiv:2010.10886v2 · 3 citations
Originality: Incremental advance
AI Analysis

This work gives a theoretical basis for analyzing generalization in meta-learning. The advance is incremental, since it builds on the existing CMI framework.

The paper bounds the generalization performance of meta-learners by extending the conditional mutual information (CMI) framework to meta-learning. The resulting bound is explicit in two CMI terms, and a numerical example illustrates its advantages over prior information-theoretic bounds.

Meta-learning optimizes an inductive bias---typically in the form of the hyperparameters of a base-learning algorithm---by observing data from a finite number of related tasks. This paper presents an information-theoretic bound on the generalization performance of any given meta-learner, which builds on the conditional mutual information (CMI) framework of Steinke and Zakynthinou (2020). In the proposed extension to meta-learning, the CMI bound involves a training \textit{meta-supersample} obtained by first sampling $2N$ independent tasks from the task environment, and then drawing $2M$ independent training samples for each sampled task. The meta-training data fed to the meta-learner is modelled as being obtained by randomly selecting $N$ tasks from the available $2N$ tasks and $M$ training samples per task from the available $2M$ training samples per task. The resulting bound is explicit in two CMI terms, which measure the information that the meta-learner output and the base-learner output provide about which training data are selected, given the entire meta-supersample. Finally, we present a numerical example that illustrates the merits of the proposed bound in comparison to prior information-theoretic bounds for meta-learning.
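The sampling procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's construction: it assumes the $2N$ tasks and $2M$ samples are arranged in pairs with one element of each pair chosen by an independent fair bit (as in the Steinke and Zakynthinou setup), and it uses a hypothetical task environment in which each task is a unit-variance Gaussian with a random mean. The function names `sample_meta_supersample` and `select_meta_training_set` are invented for this example.

```python
import random


def sample_meta_supersample(n_tasks, m_samples, rng):
    """Draw 2N tasks and 2M samples per task, arranged in pairs.

    Assumption: tasks come from a toy environment where each task is a
    Gaussian with its own mean. The result is indexed as
    [task_pair][task_slot][sample_pair][sample_slot].
    """
    supersample = []
    for _ in range(n_tasks):
        task_pair = []
        for _ in range(2):  # two candidate tasks per pair -> 2N tasks in total
            task_mean = rng.gauss(0.0, 1.0)
            # M pairs of samples -> 2M samples for this task
            samples = [[rng.gauss(task_mean, 1.0) for _ in range(2)]
                       for _ in range(m_samples)]
            task_pair.append(samples)
        supersample.append(task_pair)
    return supersample


def select_meta_training_set(supersample, rng):
    """Pick N of the 2N tasks and M of the 2M samples per task.

    Assumption: selection uses independent fair bits, one per task pair
    and one per sample pair. These bits are the "which training data are
    selected" variables that the two CMI terms refer to.
    """
    n_tasks = len(supersample)
    m_samples = len(supersample[0][0])
    task_bits = [rng.randrange(2) for _ in range(n_tasks)]
    sample_bits = [[[rng.randrange(2) for _ in range(m_samples)]
                    for _ in range(2)]
                   for _ in range(n_tasks)]
    meta_train = []
    for i in range(n_tasks):
        slot = task_bits[i]
        task = supersample[i][slot]
        bits = sample_bits[i][slot]
        meta_train.append([task[j][bits[j]] for j in range(m_samples)])
    return meta_train, task_bits, sample_bits


if __name__ == "__main__":
    rng = random.Random(0)
    supersample = sample_meta_supersample(n_tasks=3, m_samples=4, rng=rng)
    meta_train, task_bits, sample_bits = select_meta_training_set(supersample, rng)
    print(len(meta_train), "tasks,", len(meta_train[0]), "samples per task")
```

In this sketch the meta-learner would see only `meta_train`. The two CMI terms in the bound measure how much the meta-learner output and the base-learner output reveal about the selection bits (here `task_bits` and `sample_bits`) once the full meta-supersample is known.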

