CV AI LG Feb 3, 2023

Blockwise Self-Supervised Learning at Scale

Shoaib Ahmed Siddiqui, David Krueger, Yann LeCun, Stéphane Deny

arXiv:2302.01647v2 14.525 citations h-index: 137 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more efficient and scalable training methods in machine learning, with potential implications for hardware design and neuroscience, though it is incremental as it builds on existing self-supervised techniques.

The paper tackles the problem of reducing reliance on full backpropagation in deep networks by proposing a blockwise self-supervised learning method. A linear probe on the blockwise-pretrained ResNet-50 reaches a top-1 ImageNet accuracy of 70.48%, about 1.1 percentage points below the 71.57% obtained with end-to-end backpropagation.

Current state-of-the-art deep networks are all powered by backpropagation. In this paper, we explore alternatives to full backpropagation in the form of blockwise learning rules, leveraging the latest developments in self-supervised learning. We show that a blockwise pretraining procedure consisting of training independently the 4 main blocks of layers of a ResNet-50 with Barlow Twins' loss function at each block performs almost as well as end-to-end backpropagation on ImageNet: a linear probe trained on top of our blockwise pretrained model obtains a top-1 classification accuracy of 70.48%, only 1.1% below the accuracy of an end-to-end pretrained network (71.57% accuracy). We perform extensive experiments to understand the impact of different components within our method and explore a variety of adaptations of self-supervised learning to the blockwise paradigm, building an exhaustive understanding of the critical avenues for scaling local learning rules to large networks, with implications ranging from hardware design to neuroscience.
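
As a rough illustration of the idea in the abstract, the sketch below computes a Barlow Twins style loss separately at the output of each block for two views of the same batch. It is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the names `barlow_twins_loss` and `blockwise_losses` are hypothetical, the toy blocks are plain linear maps with ReLU rather than ResNet-50 stages, the weight `lam=5e-3` is borrowed from the original Barlow Twins paper, and any projector heads, pooling, or augmentations are omitted.

```python
import numpy as np


def barlow_twins_loss(z1, z2, lam=5e-3, eps=1e-6):
    """Barlow Twins style loss for two embedding batches of shape (N, D).

    lam weights the off-diagonal (redundancy) term; 5e-3 is an assumed default.
    """
    n = z1.shape[0]
    # Standardize every feature over the batch dimension.
    z1 = (z1 - z1.mean(axis=0)) / (z1.std(axis=0) + eps)
    z2 = (z2 - z2.mean(axis=0)) / (z2.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (D, D).
    c = z1.T @ z2 / n
    diag = np.diag(c)
    on_diag = np.sum((1.0 - diag) ** 2)            # pull diagonal toward 1
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)  # push the rest toward 0
    return on_diag + lam * off_diag


def blockwise_losses(x1, x2, blocks, lam=5e-3):
    """Return one local loss per block for two views x1, x2 of shape (N, D).

    `blocks` is a list of weight matrices standing in for network blocks.
    """
    losses = []
    h1, h2 = x1, x2
    for w in blocks:
        # Toy block: linear map followed by ReLU (an assumption for brevity).
        h1 = np.maximum(h1 @ w, 0.0)
        h2 = np.maximum(h2 @ w, 0.0)
        losses.append(barlow_twins_loss(h1, h2, lam))
        # In an autodiff framework, h1 and h2 would be detached at this point
        # so that each block is updated only by its own loss.
    return losses


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view1 = rng.normal(size=(64, 8))
    view2 = view1 + 0.1 * rng.normal(size=(64, 8))  # stand-in for augmentation
    toy_blocks = [rng.normal(size=(8, 8)) for _ in range(4)]
    for i, loss in enumerate(blockwise_losses(view1, view2, toy_blocks), 1):
        print(f"block {i}: local loss = {loss:.4f}")
```

NumPy has no automatic differentiation, so the sketch only shows where each local loss is evaluated. The point that makes the scheme blockwise is the stop-gradient between blocks noted in the loop comment: no error signal from a later block reaches an earlier one.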
