Explorations in Self-Supervised Learning: Dataset Composition Testing for Object Classification
These findings offer incremental guidance for researchers tuning self-supervised pretraining in computer vision, particularly for object classification on images of varying quality.
This paper investigates how different image characteristics in pretraining datasets affect self-supervised learning for object classification, finding that depth-pretrained models work better on low-resolution images while RGB-pretrained models excel on high-resolution images, and that increased luminosity improves low-resolution performance without harming high-resolution results.
This paper investigates how sampling pretraining datasets by image characteristics affects the performance of self-supervised learning (SSL) models on object classification. We sample two apartment datasets from the Omnidata platform according to modality, luminosity, image size, and camera field of view, and use them to pretrain a SimCLR model. The encodings produced by the pretrained model are then transferred to a supervised ResNet-50 model for object classification. Through A/B testing, we find that depth-pretrained models are more effective on low-resolution images, while RGB-pretrained models perform better on higher-resolution images. We also find that increasing the luminosity of training images can improve performance on low-resolution images without degrading performance on higher-resolution images.
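For readers unfamiliar with SimCLR, the pretraining stage optimizes a contrastive objective (NT-Xent) that pulls together the embeddings of two augmented views of the same image and pushes apart the embeddings of different images. The following is a minimal pure-Python sketch of that loss, not the training code used in this work. The function names, the toy embeddings, and the temperature of 0.5 are illustrative assumptions, and a practical implementation would operate on batched tensors in a deep learning framework.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (assumes non-zero norms)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """Mean NT-Xent loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two augmentations of image i.
    The temperature value is an illustrative assumption, not a value taken
    from the paper.
    """
    n = len(view_a)
    embeddings = list(view_a) + list(view_b)  # 2N embeddings in total
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)  # index of the other view of the same image
        # Scaled similarities to every other embedding (self is excluded).
        logits = {
            j: cosine_similarity(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_logit = max(logits.values())
        log_denominator = max_logit + math.log(
            sum(math.exp(value - max_logit) for value in logits.values())
        )
        # Cross-entropy with the positive pair as the target class.
        total += log_denominator - logits[positive]
    return total / (2 * n)


# Toy check: correctly paired views should score a lower loss than mismatched ones.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = nt_xent_loss(view_a, [[1.0, 0.0], [0.0, 1.0]])
mismatched_loss = nt_xent_loss(view_a, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the aligned pairing yields the smaller loss, which is the behavior the pretraining stage relies on regardless of whether the input views come from RGB or depth images.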