LG, AI · Jul 24, 2025

C2G-KD: PCA-Constrained Generator for Data-Free Knowledge Distillation

arXiv:2507.18533v1 · h-index: 4

Originality: Incremental advance

AI Analysis

The work addresses data privacy and availability issues in machine learning, though it is an incremental advance that builds on existing knowledge distillation and generative methods.

The paper tackles data-free knowledge distillation by introducing C2G-KD, a framework that trains a class-conditional generator under PCA constraints estimated from as few as two real examples per class. The generator's synthetic samples are then used for distillation, which the authors report to be effective on MNIST.

We introduce C2G-KD, a data-free knowledge distillation framework where a class-conditional generator is trained to produce synthetic samples guided by a frozen teacher model and geometric constraints derived from PCA. The generator never observes real training data but instead learns to activate the teacher's output through a combination of semantic and structural losses. By constraining generated samples to lie within class-specific PCA subspaces estimated from as few as two real examples per class, we preserve topological consistency and diversity. Experiments on MNIST show that even minimal class structure is sufficient to bootstrap useful synthetic training pipelines.
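The structural constraint described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`fit_class_subspace`, `project_to_subspace`, `subspace_residual`) and the choice of a squared-residual penalty are assumptions made for illustration, since the abstract does not specify the exact loss. The sketch estimates a class-specific PCA subspace from two examples and measures how far a candidate sample lies from it.

```python
import numpy as np

def fit_class_subspace(examples, n_components=1):
    """Estimate a PCA subspace (mean, orthonormal basis) from a few flattened examples.

    Hypothetical helper: with two examples per class, the centered data has
    rank 1, so a single principal direction is all that can be estimated.
    """
    X = np.asarray(examples, dtype=np.float64)
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project_to_subspace(x, mean, basis):
    """Orthogonally project x onto the affine subspace mean + span(basis)."""
    coeffs = (x - mean) @ basis.T
    return mean + coeffs @ basis

def subspace_residual(x, mean, basis):
    """Squared distance from x to the subspace (assumed form of a structural penalty)."""
    diff = x - project_to_subspace(x, mean, basis)
    return float(np.sum(diff ** 2))

# Two toy "real" examples for one class, flattened to 4-dimensional vectors.
class_examples = [[1.0, 0.0, 0.0, 0.0], [3.0, 0.0, 0.0, 0.0]]
mean, basis = fit_class_subspace(class_examples)

on_subspace = np.array([5.0, 0.0, 0.0, 0.0])   # lies on the line through both examples
off_subspace = np.array([2.0, 1.0, 0.0, 0.0])  # one unit away from that line

residual_on = subspace_residual(on_subspace, mean, basis)
residual_off = subspace_residual(off_subspace, mean, basis)
```

In a training loop, a penalty of this kind would be added to a semantic term that rewards generated samples for which the frozen teacher predicts the conditioned class; how the two terms are weighted is left open here.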
