CV AI Sep 2, 2025

Unsupervised Training of Vision Transformers with Synthetic Negatives

Nikolaos Giakoumoglou, Andreas Floros, Kleanthis Marios Papadopoulos, Tania Stathaki

arXiv:2509.02024v1 3.6 h-index: 8

Originality: Synthesis-oriented

AI Analysis

The paper addresses a neglected aspect of self-supervised learning for vision transformers, but its contribution is incremental because it builds on existing synthetic negative techniques.

The paper tackles the problem of improving vision transformer representation learning by integrating synthetic hard negatives, resulting in performance improvements for DeiT-S and Swin-T architectures.

This paper does not introduce a novel method per se. Instead, we address the neglected potential of hard negative samples in self-supervised learning. Previous works explored synthetic hard negatives but rarely in the context of vision transformers. We build on this observation and integrate synthetic hard negatives to improve vision transformer representation learning. This simple yet effective technique notably improves the discriminative power of learned representations. Our experiments show performance improvements for both DeiT-S and Swin-T architectures.
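The abstract does not say how the synthetic negatives are built, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes a "hard" negative is one with high cosine similarity to the query, and that new negatives are synthesized as random convex combinations of pairs of hard negatives in feature space, in the spirit of earlier hard negative mixing work. The function names, the temperature of 0.2, and the random stand-in embeddings are hypothetical. It uses plain Python for readability; a real pipeline would operate on batched encoder outputs from a model such as DeiT-S or Swin-T.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Dot product; equals cosine similarity for unit-norm inputs."""
    return sum(x * y for x, y in zip(a, b))


def synthesize_hard_negatives(query, negatives, num_hard=4, num_synthetic=4, rng=None):
    """Mix pairs of the hardest negatives into new unit-norm negatives.

    Assumption: hardness is similarity to the query, and synthesis is a
    random convex combination of two distinct hard negatives.
    """
    if rng is None:
        rng = random.Random(0)
    # Keep the negatives that are most similar to the query.
    hardest = sorted(negatives, key=lambda n: dot(query, n), reverse=True)[:num_hard]
    synthetic = []
    for _ in range(num_synthetic):
        a, b = rng.sample(hardest, 2)
        alpha = rng.random()
        mixed = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
        synthetic.append(normalize(mixed))
    return synthetic


def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE loss for one query, with the positive at logit index 0."""
    logits = [dot(query, positive) / temperature]
    logits += [dot(query, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]


# Random stand-ins for encoder embeddings (hypothetical data).
rng = random.Random(0)
dim = 8
query = normalize([rng.gauss(0, 1) for _ in range(dim)])
positive = normalize([q + 0.1 * rng.gauss(0, 1) for q in query])
negatives = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(16)]

synthetic = synthesize_hard_negatives(query, negatives, rng=rng)
base_loss = info_nce(query, positive, negatives)
hard_loss = info_nce(query, positive, negatives + synthetic)
```

Appending the synthetic negatives adds terms to the softmax denominator, so `hard_loss` is larger than `base_loss` for the same query. In this sketch, that larger loss is what pushes the encoder toward more discriminative representations.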
