Tobias Nauen

h-index13
2papers

2 Papers

47.9LGApr 30
Hyperspherical Forward-Forward with Prototypical Representations

Shalini Sarode, Brian Moser, Joachim Folz et al.

The Forward-Forward (FF) algorithm presents a compelling, bio-inspired alternative to backpropagation. However, while efficient in training, it has a computationally prohibitive inference process that requires a separate forward pass for every class that is evaluated. In this work, we introduce the Hyperspherical Forward-Forward (HFF), a novel reformulation that resolves this critical bottleneck. Our core innovation is to reframe the local objective of each layer from a binary goodness-of-fit task to a direct multi-class classification problem within a hyperspherical feature space. We achieve this by learning a set of class-specific, unit-norm prototypes that act as geometric anchors and implicit negatives. This architectural innovation preserves the benefits of local training while enabling weight update and inference in a single forward pass, making it >40x faster than the original FF algorithm. Our method is simple to implement, scales effectively to modern convolutional architectures, and achieves superior accuracy on standard image classification benchmarks, closing the gap with backpropagation. Most notably, we are among the first greedy local-learning methods to report over 25% top-1 accuracy on ImageNet-1k, and 65.96% with transfer learning.

CVFeb 5, 2025
A Study in Dataset Distillation for Image Super-Resolution

Tobias Dietz, Brian B. Moser, Tobias Nauen et al.

Dataset distillation aims to compress large datasets into compact yet highly informative subsets that preserve the training behavior of the original data. While this concept has gained traction in classification, its potential for image Super-Resolution (SR) remains largely untapped. In this work, we conduct the first systematic study of dataset distillation for SR, evaluating both pixel- and latent-space formulations. We show that a distilled dataset, occupying only 8.88% of the original size, can train SR models that retain nearly the same reconstruction fidelity as those trained on full datasets. Furthermore, we analyze how initialization strategies and distillation objectives affect efficiency, convergence, and visual quality. Our findings highlight the feasibility of SR dataset distillation and establish foundational insights for memory- and compute-efficient generative restoration models.