CV | LG | Feb 1, 2025

Contrastive Forward-Forward: A Training Algorithm of Vision Transformer

arXiv:2502.00571v1 | 5 citations | h-index: 2 | Neural Networks

Originality: Incremental advance

AI Analysis

This work addresses the need for brain-inspired training algorithms in deep learning, offering incremental improvements for vision tasks.

The authors address the performance gap of the Forward-Forward training algorithm by extending it to Vision Transformers and revising it with insights from contrastive learning. The resulting method reaches up to 10% higher accuracy and converges 5 to 20 times faster than baseline Forward-Forward, and it narrows the gap to backpropagation.

Although backpropagation is widely accepted as the training algorithm for artificial neural networks, researchers continue to look to the brain for inspiration in search of training methods with potentially better performance. Forward-Forward is a new training algorithm that more closely resembles what occurs in the brain, although it shows a significant performance gap compared to backpropagation. In the Forward-Forward algorithm, a loss function is placed after each layer, and each layer is updated using two local forward passes and one local backward pass. Forward-Forward is in its early stages and has so far been designed and evaluated on simple multi-layer perceptron networks for image classification tasks. In this work, we extend the algorithm to a more complex and modern network, namely the Vision Transformer. Inspired by insights from contrastive learning, we revise the algorithm and introduce Contrastive Forward-Forward. Experimental results show that our proposed algorithm performs significantly better than the baseline Forward-Forward, increasing accuracy by up to 10% and speeding up convergence by 5 to 20 times on Vision Transformer. Furthermore, taking Cross Entropy as the baseline loss function for backpropagation, we show that the proposed modifications to the baseline Forward-Forward reduce its performance gap compared to backpropagation on Vision Transformer, and that the modified algorithm even outperforms backpropagation in certain conditions, such as inaccurate supervision.
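The layer-local update described in the abstract (two local forward passes, one local backward pass, loss placed directly after the layer) can be illustrated with a minimal sketch of the baseline Forward-Forward step for a single layer. This is not the paper's Contrastive Forward-Forward loss or its Vision Transformer setup. It assumes the commonly described formulation in which a layer's "goodness" is the sum of its squared activations, pushed above a threshold for positive inputs and below it for negative inputs. The names FFLayer, theta, lr, and run_demo, the bias-free ReLU layer, and the toy inputs are all hypothetical choices made for illustration.

```python
import math
import random


def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    # Numerically stable logistic function (the derivative of softplus).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class FFLayer:
    """One bias-free ReLU layer trained with its own local loss (illustrative)."""

    def __init__(self, n_in, n_out, theta=2.0, lr=0.03, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        self.theta = theta  # assumed goodness threshold
        self.lr = lr        # assumed learning rate

    def forward(self, x):
        return [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in self.w]

    @staticmethod
    def goodness(h):
        # Assumed goodness measure: sum of squared activations.
        return sum(v * v for v in h)

    def local_step(self, x_pos, x_neg):
        # Two local forward passes: one on positive data, one on negative data.
        h_pos, h_neg = self.forward(x_pos), self.forward(x_neg)
        g_pos, g_neg = self.goodness(h_pos), self.goodness(h_neg)
        loss = softplus(self.theta - g_pos) + softplus(g_neg - self.theta)

        # One local backward pass: gradients stay inside this layer.
        d_pos = -sigmoid(self.theta - g_pos)  # dloss/dg_pos
        d_neg = sigmoid(g_neg - self.theta)   # dloss/dg_neg
        for j, row in enumerate(self.w):
            # dg/dh_j = 2*h_j; inactive ReLU units have h_j = 0, so no gradient.
            gp = d_pos * 2.0 * h_pos[j]
            gn = d_neg * 2.0 * h_neg[j]
            for i in range(len(row)):
                row[i] -= self.lr * (gp * x_pos[i] + gn * x_neg[i])
        return loss


def run_demo(steps=50):
    # Toy positive and negative inputs, chosen only for illustration.
    layer = FFLayer(n_in=4, n_out=8)
    x_pos, x_neg = [1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]
    return [layer.local_step(x_pos, x_neg) for _ in range(steps)]


losses = run_demo()
print("first local loss:", losses[0], "last local loss:", losses[-1])
```

In a deeper network, each layer would run the same step on the output of the layer below, with no gradient passed between layers. That locality is what distinguishes this scheme from end-to-end backpropagation.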
