ML · LG · Jun 1

Provable Data Scaling Law for Meta Learning via Complexity Minimization

Kazuto Fukuchi, Ryuichiro Hataya, Kota Matsui

arXiv:2606.02008 · 90.7

AI Analysis

This work provides a theoretical foundation for the empirical observation that larger pre-training data improves downstream sample efficiency in meta-learning, addressing a gap in existing theoretical frameworks.

The paper introduces a complexity minimization framework for meta-representation learning that provably captures the scaling behavior where downstream sample efficiency improves with increased pre-training data. The framework is shown to reduce error rates in few-shot adaptation as meta-training data grows, and empirical results demonstrate that adding complexity regularization to existing meta-learning methods consistently improves downstream sample efficiency.

Pre-training has become a fundamental paradigm in modern machine learning, with one of its key empirical benefits being reduced downstream sample complexity as the scale of pre-training data increases. However, existing theoretical frameworks for pre-training do not fully explain this phenomenon. In this paper, we introduce complexity minimization, a novel meta-representation learning framework designed to enable theoretical analysis of this scaling behavior. The framework learns representations by evaluating the downstream model complexity best suited to each domain and minimizing the worst case of that complexity across source domains. Our end-to-end theoretical analysis, spanning pre-training through downstream regression, shows that this framework provably captures the scaling behavior; in particular, we show that the error rate of few-shot adaptation improves as the amount of meta-training data grows. Empirically, we demonstrate that incorporating complexity regularization into existing meta-learning methods consistently improves downstream sample efficiency.
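
One way to read the worst-case objective described in the abstract is as a min-max problem over source domains. The sketch below is an illustrative formalization only, and none of its notation comes from the paper: $\phi$ is the shared representation drawn from a class $\Phi$, $t = 1, \dots, T$ indexes source domains, $\mathcal{H}_c$ is an assumed nested family of downstream model classes indexed by a complexity level $c$, $\hat{L}_t$ is the empirical loss on domain $t$, and $\varepsilon$ is an assumed loss tolerance. The paper's actual definitions may differ.

```latex
% Hypothetical notation; a reading of the abstract, not the authors' formulation.
% Per-domain complexity: the smallest complexity level at which some downstream
% model on top of \phi fits domain t to within tolerance \varepsilon.
c_t(\phi) \;=\; \min\Bigl\{\, c \ge 0 \;:\; \min_{h \in \mathcal{H}_c} \hat{L}_t(h \circ \phi) \le \varepsilon \Bigr\}

% Meta-learning objective: choose the representation whose worst-case
% per-domain complexity over the T source domains is smallest.
\hat{\phi} \;\in\; \arg\min_{\phi \in \Phi} \; \max_{1 \le t \le T} \; c_t(\phi)
```

Under this reading, a representation with a small worst-case $c_t(\phi)$ lets every source domain be fit by a simple downstream model, which is the intuition behind needing fewer samples for few-shot adaptation.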
