LG, AI, NE | Mar 14, 2024

Towards a theory of model distillation

arXiv:2403.09053v2 | 14 citations
Originality: Incremental advance
AI Analysis

This work addresses foundational questions in machine learning about model compression and interpretability, though it is incremental in building on existing PAC-learning frameworks.

The paper studies the fundamental limits and resource requirements of model distillation. It proposes a general theory, PAC-distillation, shows that distillation can be much cheaper than learning from scratch, and gives algorithms that distill neural networks into decision trees.

Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06,HVD15]. Despite many practical applications, basic questions about the extent to which models can be distilled, and the runtime and amount of data needed to distill, remain largely open. To study these questions, we initiate a general theory of distillation, defining PAC-distillation in an analogous way to PAC-learning [Val84]. As applications of this theory: (1) we propose new algorithms to extract the knowledge stored in the trained weights of neural networks: we show how to efficiently distill neural networks into succinct, explicit decision tree representations when possible by using the "linear representation hypothesis"; and (2) we prove that distillation can be much cheaper than learning from scratch, and make progress on characterizing its complexity.
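
To make the task concrete, here is a minimal sketch of distillation into a decision tree, under loud assumptions. It is not the paper's algorithm: the paper's method reads the trained weights via the linear representation hypothesis, whereas this toy only queries the teacher for labels. The `teacher` function, the greedy Gini-based `fit_tree`, the uniform input distribution, and the sample sizes are all hypothetical choices made for illustration. The final quantity is an empirical estimate of how often the student disagrees with the teacher on fresh inputs, which is the kind of error a PAC-style guarantee would bound.

```python
import random

N_BITS = 5  # hypothetical input size: 5 boolean features


def teacher(x):
    # Hypothetical "complicated" model: a tiny two-layer threshold network
    # whose weights encode the rule (x[0] AND x[1]) OR x[2].
    h1 = 1 if x[0] + x[1] - 1.5 > 0 else 0  # hidden unit: x0 AND x1
    h2 = 1 if x[2] - 0.5 > 0 else 0         # hidden unit: x2
    return 1 if h1 + h2 - 0.5 > 0 else 0    # output unit: h1 OR h2


def gini(labels):
    # Gini impurity of a list of 0/1 labels (0.0 when the list is pure).
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_tree(xs, ys, depth):
    # Greedy tree fit: a leaf is an int label, an internal node is
    # (feature index, subtree for bit 0, subtree for bit 1).
    majority = 1 if 2 * sum(ys) >= len(ys) else 0
    if depth == 0 or gini(ys) == 0.0:
        return majority
    best = None
    for i in range(N_BITS):
        left = [y for x, y in zip(xs, ys) if x[i] == 0]
        right = [y for x, y in zip(xs, ys) if x[i] == 1]
        if not left or not right:
            continue  # this bit does not split the data
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            best = (score, i)
    if best is None:
        return majority
    i = best[1]
    lo = [(x, y) for x, y in zip(xs, ys) if x[i] == 0]
    hi = [(x, y) for x, y in zip(xs, ys) if x[i] == 1]
    return (i,
            fit_tree([x for x, _ in lo], [y for _, y in lo], depth - 1),
            fit_tree([x for x, _ in hi], [y for _, y in hi], depth - 1))


def predict(tree, x):
    # Walk from the root to a leaf, following the bit at each node.
    while isinstance(tree, tuple):
        i, zero_branch, one_branch = tree
        tree = one_branch if x[i] else zero_branch
    return tree


def sample(rng):
    # Assumed input distribution: uniform over N_BITS-bit vectors.
    return tuple(rng.randint(0, 1) for _ in range(N_BITS))


rng = random.Random(0)
train = [sample(rng) for _ in range(2000)]
student = fit_tree(train, [teacher(x) for x in train], depth=3)

# Estimate the student's disagreement with the teacher on fresh samples.
fresh = [sample(rng) for _ in range(2000)]
disagreement = sum(predict(student, x) != teacher(x) for x in fresh) / len(fresh)
print("student tree:", student)
print("estimated disagreement with teacher:", disagreement)
```

The student here is an explicit, human-readable tree, while the teacher is an opaque function of weights and thresholds. That contrast is the point of the sketch; the efficiency claims in the abstract concern algorithms that also exploit access to the teacher's internals, which this example does not model.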

Code Implementations: 1 repo