LG · May 21, 2025

Model Merging is Secretly Certifiable: Non-Vacuous Generalisation Bounds for Low-Shot Learning

arXiv:2505.15798v1 · h-index: 12
Originality: Highly original
AI Analysis

This work addresses the certification of AI as trustworthy in high-stakes applications such as medicine and security. Its approach needs only minor adjustments to existing learning strategies, yet has immediate implications for practical certification.

The paper tackles the challenge of certifying IID generalisation for deep networks in low-data settings. It links model fusion methods to generalisation certificates and shows that, with minor adjustments, existing strategies yield non-trivial guarantees with as few as 100 examples, using models such as ViT-B and Mistral-7B.

Certifying the IID generalisation ability of deep networks is the first of many requirements for trusting AI in high-stakes applications from medicine to security. However, when instantiating generalisation bounds for deep networks it remains challenging to obtain non-vacuous guarantees, especially when applying contemporary large models to the small-scale data prevalent in such high-stakes fields. In this paper, we draw a novel connection between a family of learning methods based on model fusion and generalisation certificates, and surprisingly show that with minor adjustment several existing learning strategies already provide non-trivial generalisation guarantees. Essentially, by focusing on data-driven learning of downstream tasks by fusion rather than fine-tuning, the certified generalisation gap becomes tiny and independent of the base network size, facilitating its certification. Our results show for the first time non-trivial generalisation guarantees for learning with as few as 100 examples, while using vision models such as ViT-B and language models such as Mistral-7B. This observation is significant as it has immediate implications for facilitating the certification of existing systems as trustworthy, and opens up new directions for research at the intersection of practice and theory.
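
To see why learning a handful of fusion coefficients can be certified with 100 examples while fine-tuning cannot, the sketch below evaluates a textbook finite-hypothesis-class bound (Hoeffding's inequality plus a union bound) for a loss in [0, 1]. This is an illustration only and not necessarily the bound used in the paper. The function name `occam_gap`, the choice of 2 merging coefficients stored at 4 bits each, and the figure of roughly 86 million 32-bit parameters for a ViT-B-sized network are all assumptions made for the example.

```python
import math


def occam_gap(num_examples, num_params, bits_per_param, delta=0.05):
    """Upper bound on (true risk - empirical risk) for a loss in [0, 1].

    Hoeffding's inequality with a union bound over a finite hypothesis
    class of size 2 ** (num_params * bits_per_param). The bound holds
    with probability at least 1 - delta over the draw of the sample.
    """
    log_num_hypotheses = num_params * bits_per_param * math.log(2)
    return math.sqrt(
        (log_num_hypotheses + math.log(1 / delta)) / (2 * num_examples)
    )


# Hypothetical fusion setup: learn 2 merging coefficients, 4 bits each.
fusion_gap = occam_gap(num_examples=100, num_params=2, bits_per_param=4)

# Hypothetical fine-tuning setup: every weight of a ViT-B-sized network
# (assumed here to be about 86 million 32-bit parameters) is learned.
finetune_gap = occam_gap(
    num_examples=100, num_params=86_000_000, bits_per_param=32
)

print(f"fusion gap bound:    {fusion_gap:.3f}")
print(f"fine-tune gap bound: {finetune_gap:.1f}")
```

Under these assumptions the fusion bound stays well below 1, so it says something about the true error, whereas the fine-tuning bound exceeds 1 and is vacuous. The fusion bound depends only on how many coefficients are learned and how finely they are discretised, not on the size of the base network, which is the intuition the abstract describes.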

