IDEAL: Query-Efficient Data-Free Learning from Black-box Models
This work addresses efficient, low-cost model compression in settings where data privacy or model access is restricted, though its contribution is incremental over prior data-free black-box methods.
The paper tackles knowledge distillation from black-box models without access to training data or model parameters, and proposes IDEAL to reduce query costs. On CIFAR10, IDEAL improves on the DFME baseline by 5.83% while using only 0.02x of DFME's query budget, i.e. 50 times fewer queries.
Knowledge Distillation (KD) is a typical method for training a lightweight student model with the help of a well-trained teacher model. However, most KD methods require access to either the teacher's training data or its model parameters, which is unrealistic. To tackle this problem, recent works study KD under data-free and black-box settings. Nevertheless, these works require a large number of queries to the teacher model, which incurs significant monetary and computational costs. To address these problems, we propose a novel method called \emph{query-effIcient Data-free lEarning from blAck-box modeLs} (IDEAL), which learns query-efficiently from black-box model APIs to train a good student without any real data. In detail, IDEAL trains the student model in two stages: data generation and model distillation. Note that IDEAL does not require any queries in the data generation stage and queries the teacher only once for each sample in the distillation stage. Extensive experiments on various real-world datasets show the effectiveness of the proposed IDEAL. For instance, IDEAL improves on the performance of the best baseline method, DFME, by 5.83% on the CIFAR10 dataset with only 0.02x the query budget of DFME.
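
The query accounting implied by the two stages can be illustrated with a toy sketch. This is not the paper's implementation: the teacher, the linear student, the boundary-seeking sample selection, the alternation over rounds, and all names (`BlackBoxTeacher`, `LinearStudent`, `generate_samples`, `distill`, `run`) are hypothetical stand-ins chosen for illustration. The only properties taken from the text are that data generation issues no teacher queries and that distillation queries the teacher exactly once per sample.

```python
import random


class BlackBoxTeacher:
    """Hypothetical black-box API: returns a label only and counts queries."""

    def __init__(self):
        self.queries = 0

    def predict(self, x):
        self.queries += 1
        # Toy decision rule, hidden from the student (assumption).
        return 1 if x[0] + x[1] > 0 else 0


class LinearStudent:
    """Tiny linear classifier standing in for the student network."""

    def __init__(self):
        self.w = [0.0, 0.0]
        self.b = 0.0

    def score(self, x):
        return self.w[0] * x[0] + self.w[1] * x[1] + self.b

    def predict(self, x):
        return 1 if self.score(x) > 0 else 0

    def update(self, x, y, lr=0.1):
        # Perceptron-style step; err is 0 when the prediction is correct.
        err = y - self.predict(x)
        self.w[0] += lr * err * x[0]
        self.w[1] += lr * err * x[1]
        self.b += lr * err


def generate_samples(student, n, rng, pool_factor=4):
    """Stage 1, data generation: uses the student only, never the teacher.

    Assumption: draw random candidates and keep the n points closest to the
    student's current decision boundary. The paper's generator is not
    described in the abstract.
    """
    pool = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n * pool_factor)]
    pool.sort(key=lambda x: abs(student.score(x)))
    return pool[:n]


def distill(teacher, student, samples, epochs=5):
    """Stage 2, model distillation: one teacher query per sample, then reuse."""
    labeled = [(x, teacher.predict(x)) for x in samples]  # the only queries
    for _ in range(epochs):
        for x, y in labeled:
            student.update(x, y)
    return labeled


def run(rounds=20, per_round=50, seed=0):
    """Alternate the two stages for a fixed number of rounds (assumption)."""
    rng = random.Random(seed)
    teacher, student = BlackBoxTeacher(), LinearStudent()
    for _ in range(rounds):
        samples = generate_samples(student, per_round, rng)
        distill(teacher, student, samples)
    return teacher, student


if __name__ == "__main__":
    teacher, student = run()
    # Total queries equal rounds * per_round, regardless of epochs.
    print(teacher.queries)  # prints 1000
```

In this sketch the query count depends only on the number of generated samples, not on how many epochs the student trains on them, which is the sense in which caching one label per sample keeps the budget small.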