LG CV · May 5

Covariance-Aware Goodness for Scalable Forward-Forward Learning

Xiaoyi Jiang, Bashir M. Al-Hashimi, Kai Xu

arXiv:2605.04346 · 39.4 · h-index: 2

Predicted impact top 66% in LG · last 90 days · Originality Highly original

AI Analysis

For researchers in biologically plausible or memory-efficient learning, this work significantly improves FF learning performance on complex benchmarks, though it remains incremental as it builds on existing FF methods.

The paper addresses the performance gap between Forward-Forward (FF) learning and backpropagation in convolutional networks. It proposes a covariance-aware goodness framework whose BP-free model achieves 73.01% on ImageNet-100 and 50.30% on Tiny-ImageNet. With hybrid blocks, it narrows the gap to backpropagation to 3.6% on ImageNet-100 and matches backpropagation on Tiny-ImageNet, while reducing peak memory by approximately 50%.

The Forward-Forward (FF) algorithm eliminates global gradient flow and the need to store full-network activations. However, in convolutional settings, existing BP-free FF methods significantly underperform backpropagation (BP) on complex benchmarks such as ImageNet-100 and Tiny-ImageNet. We identify this gap as a structural bottleneck in goodness extraction: the standard sum-of-squares formulation collapses feature volumes into channel-wise activation energies, which omits critical second-order dependencies. To address this, we propose a framework centered on three key components. First, Bi-axis Covariance Goodness (BiCovG) explicitly augments the standard goodness function with structured second-order information along two axes: cross-channel projections that model inter-feature covariance, and nested multi-scale aggregation that encodes spatial correlation statistics. This provides a tractable approximation to covariance-aware goodness without the prohibitive O(C^2) complexity of explicit matrix estimation. Second, a lightweight Logistic Fusion module aggregates layer-wise predictions, amplifying the contribution of deeper representations. Third, the Feature Alignment Layer (FAL) introduces a zero-initialized correction at block boundaries to mitigate representation misalignment in deep, locally trained networks. Together, these three components effectively double the depth of viable Forward-Forward learning, extending robust layer utilization from shallow baselines to 16-layer architectures such as VGG-16. The resulting BP-free model achieves 73.01% on ImageNet-100 and 50.30% on Tiny-ImageNet. As a practical extension, Hybrid Goodness Blocks control the scope of gradient propagation via configurable block sizes, further narrowing the ImageNet-100 gap to 3.6% and matching BP on Tiny-ImageNet, while still reducing peak memory by approximately 50% relative to BP.
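The bottleneck described in the abstract can be illustrated with a small NumPy sketch. This is not the paper's BiCovG formulation, which the abstract does not specify. It only shows, under assumed names (`sum_of_squares_goodness`, `projected_goodness`, and a fixed, hand-picked projection `proj`), that channel-wise energies cannot distinguish two feature volumes that differ only in their cross-channel correlation, while the energy of a cross-channel projection can. For a projection vector p and a flattened feature matrix X of shape (C, H·W), the projected energy equals p^T (X X^T) p, so it probes the C×C second-moment matrix without forming it.

```python
import numpy as np

def sum_of_squares_goodness(x):
    """Channel-wise activation energies of a feature volume x with shape (C, H, W)."""
    return (x ** 2).sum(axis=(1, 2))  # shape (C,)

def projected_goodness(x, proj):
    """Energies of k cross-channel projections (hypothetical helper).

    proj has shape (k, C). Each row mixes channels at every spatial
    position, so its energy depends on cross-channel second moments.
    """
    c = x.shape[0]
    z = proj @ x.reshape(c, -1)      # shape (k, H*W)
    return (z ** 2).sum(axis=1)      # shape (k,)

rng = np.random.default_rng(0)
a = rng.standard_normal((8, 8))

# Two 2-channel volumes with identical per-channel energy:
x_corr = np.stack([a, a])    # channels perfectly correlated
x_anti = np.stack([a, -a])   # channels perfectly anti-correlated

# Illustrative fixed projection; how a real method chooses or learns
# its projections is not stated in the abstract.
proj = np.array([[1.0, 1.0]]) / np.sqrt(2.0)

same_energy = np.allclose(
    sum_of_squares_goodness(x_corr), sum_of_squares_goodness(x_anti)
)
g_corr = projected_goodness(x_corr, proj)
g_anti = projected_goodness(x_anti, proj)

print("channel energies identical:", same_energy)
print("projected energy (correlated):", g_corr)
print("projected energy (anti-correlated):", g_anti)
```

With k projections the cost per spatial position is O(kC) rather than the O(C^2) of an explicit covariance estimate, which is the kind of trade-off the abstract alludes to.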
