LG | Oct 2, 2025

Knowledge Distillation Detection for Open-weights Models

arXiv:2510.02302v1 | 1 citation | h-index: 4 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses model provenance and unauthorized replication, both practical concerns for AI practitioners, but it is an incremental advance because it builds on existing distillation and detection methods.

The paper tackles the problem of detecting whether a student model has been distilled from a teacher model, using only the student's weights and the teacher's API. It introduces a model-agnostic framework that improves detection accuracy over the strongest baselines by 59.6% on CIFAR-10, 71.2% on ImageNet, and 20.0% for text-to-image generation.

We propose the task of knowledge distillation detection, which aims to determine whether a student model has been distilled from a given teacher, under a practical setting where only the student's weights and the teacher's API are available. This problem is motivated by growing concerns about model provenance and unauthorized replication through distillation. To address this task, we introduce a model-agnostic framework that combines data-free input synthesis and statistical score computation for detecting distillation. Our approach is applicable to both classification and generative models. Experiments on diverse architectures for image classification and text-to-image generation show that our method improves detection accuracy over the strongest baselines by 59.6% on CIFAR-10, 71.2% on ImageNet, and 20.0% for text-to-image generation. The code is available at https://github.com/shqii1j/distillation_detection.
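
The abstract does not spell out how the statistical score is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes toy linear classifiers, uses random probe inputs as a stand-in for data-free input synthesis, and uses the mean KL divergence between teacher and student outputs as an illustrative score. All names (`make_linear_model`, `distillation_score`, `rank_teachers`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def make_linear_model(weights):
    """Hypothetical toy classifier: class probabilities = softmax(W x)."""
    def predict(x):
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
        return softmax(logits)
    return predict


def distillation_score(student, teacher_api, probes):
    """Mean KL(teacher || student) over the probes; lower means closer outputs."""
    total = sum(kl_divergence(teacher_api(x), student(x)) for x in probes)
    return total / len(probes)


def rank_teachers(student, teacher_apis, probes):
    """Return (name, score) pairs sorted from most to least similar teacher."""
    scores = {
        name: distillation_score(student, api, probes)
        for name, api in teacher_apis.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1])


rng = random.Random(0)
num_classes, dim = 3, 4

# Two unrelated candidate teachers, each reachable only through its "API".
weights_a = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]
weights_b = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]

# Assumption: the student is modelled as teacher_a plus small weight noise,
# a crude proxy for a model distilled from teacher_a.
weights_s = [[w + rng.gauss(0, 0.05) for w in row] for row in weights_a]

student = make_linear_model(weights_s)
teacher_apis = {
    "teacher_a": make_linear_model(weights_a),
    "teacher_b": make_linear_model(weights_b),
}

# Random probes stand in for synthesized inputs; no training data is used.
probes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(200)]

ranking = rank_teachers(student, teacher_apis, probes)
for name, score in ranking:
    print(f"{name}: mean KL = {score:.4f}")
```

In this toy setup the candidate with the lowest score is flagged as the most likely teacher. A real detector would need probes that actually separate candidate teachers and a calibrated decision threshold, neither of which this sketch attempts.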
