CV · AI · LG · Jul 4, 2025

FastDINOv2: Frequency Based Curriculum Learning Improves Robustness and Training Speed

arXiv:2507.03779v2 · 3 citations · h-index: 2 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the high computational cost faced by practitioners who need to reproduce foundation models on private data or new modalities, offering an incremental improvement in training speed and robustness.

The paper tackles the high computational cost of reproducing large-scale vision foundation models like DINOv2 by proposing a pre-training strategy that accelerates convergence and improves robustness. It reduces pre-training time by 1.6x and FLOPs by 2.25x while matching robustness on corruption benchmarks and maintaining competitive linear probing performance.

Large-scale vision foundation models such as DINOv2 achieve impressive performance by leveraging massive architectures and training datasets. However, many scenarios require practitioners to reproduce those pre-training solutions, for example on private data, on new modalities, or simply for scientific inquiry, which is currently extremely demanding computationally. We thus propose a novel pre-training strategy for DINOv2 that accelerates convergence and, as a by-product, strengthens robustness to common corruptions. Our approach combines a frequency filtering curriculum, in which low-frequency content is seen first, with a Gaussian noise patching augmentation. Applied to a ViT-B/16 backbone trained on ImageNet-1K, our method reduces pre-training time by 1.6x and FLOPs by 2.25x while still achieving matching robustness on corruption benchmarks (ImageNet-C) and maintaining competitive linear probing performance compared with the baseline. This dual benefit of efficiency and robustness makes large-scale self-supervised foundation modeling more attainable, while opening the door to novel exploration of data curriculum and augmentation as means to improve the robustness of self-supervised learning models. The code is available at https://github.com/KevinZ0217/fast_dinov2
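
To make the two ingredients named in the abstract concrete, here is a minimal NumPy sketch of a low-frequency-first curriculum and a Gaussian noise patching augmentation. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how low-frequency content is extracted, so an FFT low-pass filter is used here as a stand-in, and the function names (`low_pass_filter`, `cutoff_for_epoch`, `gaussian_noise_patch`, `augment`) and every numeric default (cutoff, stage split, patch size, noise level) are hypothetical placeholders. Whether noise patching is applied in both stages is also an assumption. Images are assumed to be float arrays in [0, 1].

```python
import numpy as np


def low_pass_filter(image, cutoff):
    """Keep spatial frequencies whose radius is <= cutoff (1.0 = Nyquist).

    Assumption: an FFT mask stands in for whatever low-frequency
    extraction the paper actually uses.
    """
    h, w = image.shape[:2]
    fy = np.fft.fftfreq(h)[:, None]  # cycles/pixel, in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]
    mask = (np.sqrt(fx ** 2 + fy ** 2) / 0.5) <= cutoff
    if image.ndim == 3:
        mask = mask[:, :, None]  # broadcast over channels
    spectrum = np.fft.fft2(image, axes=(0, 1))
    return np.fft.ifft2(spectrum * mask, axes=(0, 1)).real


def cutoff_for_epoch(epoch, total_epochs, low_freq_fraction=0.5, low_cutoff=0.25):
    """Two-stage schedule: low-pass first, then unfiltered images (None).

    Both defaults are hypothetical, not values from the paper.
    """
    if epoch < int(total_epochs * low_freq_fraction):
        return low_cutoff
    return None


def gaussian_noise_patch(image, rng, patch_frac=0.3, sigma=0.1):
    """Add Gaussian noise to one random rectangular patch; returns a copy."""
    h, w = image.shape[:2]
    ph, pw = max(1, int(h * patch_frac)), max(1, int(w * patch_frac))
    top = int(rng.integers(0, h - ph + 1))
    left = int(rng.integers(0, w - pw + 1))
    out = image.astype(np.float64, copy=True)
    patch = out[top:top + ph, left:left + pw]  # view into `out`
    patch += rng.normal(0.0, sigma, size=patch.shape)
    return np.clip(out, 0.0, 1.0)


def augment(image, epoch, total_epochs, rng):
    """Apply the curriculum stage for this epoch, then noise patching."""
    cutoff = cutoff_for_epoch(epoch, total_epochs)
    if cutoff is not None:
        image = np.clip(low_pass_filter(image, cutoff), 0.0, 1.0)
    return gaussian_noise_patch(image, rng)
```

In a training loop, `augment(image, epoch, total_epochs, np.random.default_rng(seed))` would be called per image before the usual DINOv2 view generation; early epochs then see only coarse structure, and later epochs see the full image.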

