CV · Feb 17, 2024

On Good Practices for Task-Specific Distillation of Large Pretrained Visual Models

arXiv:2402.11305v2 · 3 citations · h-index: 4 · Trans. Mach. Learn. Res.
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient specialized models in real-world applications, but it is incremental as it refines existing distillation methods.

The paper tackles the problem of distilling large pretrained visual models into compact task-specific ones, showing that existing practices are suboptimal and proposing new guidelines, with a Mixup variant improving distillation without engineered prompts.

Large pretrained visual models exhibit remarkable generalization across diverse recognition tasks. Yet, real-world applications often demand compact models tailored to specific problems. Variants of knowledge distillation have been devised for such a purpose, enabling task-specific compact models (the students) to learn from a generic large pretrained one (the teacher). In this paper, we show that the excellent robustness and versatility of recent pretrained models challenge common practices established in the literature, calling for a new set of optimal guidelines for task-specific distillation. To address the lack of samples in downstream tasks, we also show that a variant of Mixup based on stable diffusion complements standard data augmentation. This strategy eliminates the need for engineered text prompts and improves distillation of generic models into streamlined specialized networks.
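
To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python. It assumes the classic temperature-scaled soft-label distillation loss and the standard input-space Mixup interpolation; neither is confirmed by the abstract as the paper's exact formulation, and the Stable Diffusion-based Mixup variant the paper proposes is not reproduced here. The function names (`softmax`, `distillation_loss`, `mixup`) and the default temperature are illustrative choices.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-label objective, used here only
    to illustrate how a student is trained to match a teacher.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return temperature ** 2 * kl


def mixup(x_a, x_b, lam):
    """Standard Mixup: convex combination of two inputs with weight lam in [0, 1].

    Assumption: plain pixel-space interpolation, not the diffusion-based variant.
    """
    return [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print("distillation loss:", distillation_loss(student, teacher))
    print("mixed input:", mixup([0.0, 2.0], [2.0, 0.0], lam=0.5))
```

In a distillation setting, the mixed input would be passed through both teacher and student, and the loss computed on their outputs, so no ground-truth label is needed for the synthetic sample.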
