CV LG · Jan 21, 2021

Pre-training without Natural Images

Hirokatsu Kataoka, Kazushige Okayasu, Asato Matsumoto, Eisuke Yamagata, Ryosuke Yamada, Nakamasa Inoue, Akio Nakamura, Yutaka Satoh

arXiv:2101.08515v126.2149 citations · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of reducing reliance on large, human-annotated natural image datasets for pre-training in computer vision. It is an incremental step, since the resulting models do not outperform existing pre-training methods in all settings.

The paper tackles the problem of pre-training convolutional neural networks without natural images. It introduces Formula-driven Supervised Learning, which uses automatically generated fractal images and labels and can, in principle, produce a dataset of unlimited size. Models pre-trained on the resulting FractalDB partially surpass the accuracy of ImageNet/Places pre-trained models in some settings.

Is it possible to use convolutional neural networks pre-trained without any natural images to assist natural image understanding? The paper proposes a novel concept, Formula-driven Supervised Learning. We automatically generate image patterns and their category labels by assigning fractals, which are based on a natural law existing in the background knowledge of the real world. Theoretically, the use of automatically generated images instead of natural images in the pre-training phase allows us to generate an infinite-scale dataset of labeled images. Although the models pre-trained with the proposed Fractal DataBase (FractalDB), a database without natural images, do not necessarily outperform models pre-trained with human-annotated datasets in all settings, we are able to partially surpass the accuracy of ImageNet/Places pre-trained models. The image representation learned with the proposed FractalDB captures a unique feature in the visualization of convolutional layers and attentions.
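
To make the idea of formula-driven labels concrete, here is a minimal sketch, not the authors' pipeline. It assumes fractals are rendered with an iterated function system (IFS) via the chaos game, treats each randomly drawn set of affine maps as one category, and creates instances within a category by slightly jittering the parameters. The function names (`random_ifs`, `render_fractal`, `make_dataset`), the parameter ranges, and the image size are all hypothetical choices for illustration.

```python
import random


def random_ifs(rng, n_maps=3):
    """Draw n_maps affine maps (a, b, c, d, e, f).

    Assumption: linear coefficients stay in [-0.45, 0.45] so each map is
    contractive and the iterated points remain bounded.
    """
    maps = []
    for _ in range(n_maps):
        a, b, c, d = (rng.uniform(-0.45, 0.45) for _ in range(4))
        e, f = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        maps.append((a, b, c, d, e, f))
    return maps


def render_fractal(maps, rng, n_points=5000, size=32, burn_in=20):
    """Render an IFS attractor as a size x size binary image (chaos game)."""
    x, y = 0.0, 0.0
    points = []
    for i in range(n_points + burn_in):
        a, b, c, d, e, f = rng.choice(maps)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i >= burn_in:  # skip early points that are far from the attractor
            points.append((x, y))

    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span_x = (max(xs) - min_x) or 1.0  # avoid division by zero
    span_y = (max(ys) - min_y) or 1.0

    image = [[0] * size for _ in range(size)]
    for px, py in points:
        col = min(size - 1, int((px - min_x) / span_x * size))
        row = min(size - 1, int((py - min_y) / span_y * size))
        image[row][col] = 1
    return image


def make_dataset(n_categories=4, per_category=3, seed=0):
    """Return (image, label) pairs; the label is the index of the IFS used."""
    rng = random.Random(seed)
    data = []
    for label in range(n_categories):
        base = random_ifs(rng)
        for _ in range(per_category):
            # Hypothetical intra-category variation: jitter every parameter by 5%.
            jittered = [tuple(p * rng.uniform(0.95, 1.05) for p in m) for m in base]
            data.append((render_fractal(jittered, rng), label))
    return data
```

Because the label comes from the generating formula rather than from a human annotator, raising `n_categories` or `per_category` yields more labeled images at no annotation cost, which is the property the abstract describes as an infinite-scale dataset.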
