Adversarial Network Compression
The paper addresses the computational cost of deploying deep models, but the contribution is incremental because it builds on existing knowledge transfer methods.
The paper tackles neural network compression by transferring knowledge from a large teacher model to a smaller student using an adversarial approach without labels, achieving small accuracy drops and state-of-the-art results on five datasets.
Neural network compression has recently received much attention due to the computational requirements of modern deep models. In this work, our objective is to transfer knowledge from a deep and accurate model to a smaller one. Our contributions are threefold: (i) we propose an adversarial network compression approach to train the small student network to mimic the large teacher, without the need for labels during training; (ii) we introduce a regularization scheme to prevent a trivially strong discriminator without reducing the network capacity; and (iii) our approach generalizes to different teacher-student models. In an extensive evaluation on five standard datasets, we show that our student has a small accuracy drop, achieves better performance than other knowledge transfer approaches, and surpasses the performance of the same network trained with labels. In addition, we demonstrate state-of-the-art results compared to other compression strategies.
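To make the label-free adversarial setup concrete, the following is a minimal sketch of the two competing objectives. It assumes a standard GAN-style binary cross-entropy formulation, which the abstract does not specify; the function names (`discriminator_loss`, `student_adversarial_loss`) and the use of scalar discriminator logits are hypothetical choices for illustration, and the regularization scheme from contribution (ii) is not shown.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one predicted probability."""
    return -(target * math.log(prob + eps)
             + (1.0 - target) * math.log(1.0 - prob + eps))


def discriminator_loss(d_logits_teacher, d_logits_student):
    """Assumed objective: the discriminator labels teacher outputs 1
    and student outputs 0. No ground-truth class labels are involved."""
    losses = [bce(sigmoid(z), 1.0) for z in d_logits_teacher]
    losses += [bce(sigmoid(z), 0.0) for z in d_logits_student]
    return sum(losses) / len(losses)


def student_adversarial_loss(d_logits_student):
    """Assumed objective: the student is rewarded when the discriminator
    mistakes its outputs for the teacher's (target 1)."""
    losses = [bce(sigmoid(z), 1.0) for z in d_logits_student]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Hypothetical discriminator logits for a batch of three inputs.
    teacher_scores = [2.0, 1.5, 3.0]
    student_scores = [-1.0, 0.2, -0.5]
    print("discriminator loss:", discriminator_loss(teacher_scores, student_scores))
    print("student loss:", student_adversarial_loss(student_scores))
```

In this sketch the only supervision signal is whether an output came from the teacher or the student, which is what allows training without labels. In a full training loop the two losses would be minimized in alternation, each with respect to its own network's parameters.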