CV · Mar 23, 2025

End-to-End Implicit Neural Representations for Classification

arXiv:2503.18123v1 · 6 citations · h-index: 6 · Has Code · CVPR

Originality: Incremental advance

AI Analysis

Aimed at machine learning researchers, this work addresses the challenge of using implicit neural representations (INRs) for classification and offers an incremental improvement over existing methods.

The paper tackles classification on implicit neural representations (INRs), which currently underperforms pixel-based methods. It proposes an end-to-end strategy that pairs a meta-learned SIREN initialization with a learned learning-rate scheme, so that the fitted representations are better suited to classification. On the CIFAR-10 SIREN classification task it raises the state of the art from 38.8% to 59.6% without augmentations, and it sets a first baseline on the full ImageNet-1K dataset with 23.6% accuracy.

Implicit neural representations (INRs) such as NeRF and SIREN encode a signal in neural network parameters and show excellent results for signal reconstruction. Using INRs for downstream tasks, such as classification, is, however, not straightforward. Inherent symmetries in the parameters pose challenges, and current works primarily focus on designing architectures that are equivariant to these symmetries. However, INR-based classification still significantly underperforms pixel-based methods like CNNs. This work presents an end-to-end strategy for initializing SIRENs together with a learned learning-rate scheme, to yield representations that improve classification accuracy. We show that a simple, straightforward Transformer model applied to a meta-learned SIREN, without incorporating explicit symmetry equivariances, outperforms the current state-of-the-art. On the CIFAR-10 SIREN classification task, we improve the state-of-the-art without augmentations from 38.8% to 59.6%, and from 63.4% to 64.7% with augmentations. We demonstrate scalability on the high-resolution Imagenette dataset, achieving reasonable reconstruction quality with a classification accuracy of 60.8%, and are the first to do INR classification on the full ImageNet-1K dataset, where we achieve a SIREN classification performance of 23.6%. To the best of our knowledge, no other SIREN classification approach has managed to set a classification baseline for any high-resolution image dataset. Our code is available at https://github.com/SanderGielisse/MWT
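
To make the idea of a shared initialization combined with a per-parameter learning-rate scheme more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (see their repository for that). It assumes a Meta-SGD-style inner loop, a one-hidden-layer SIREN on a 1D signal, and hand-set values for `theta0` and `alpha` where the paper would use meta-learned ones. All names (`siren`, `fit_inr`, `flatten`, `HIDDEN`) are hypothetical, and the outer meta-learning loop and the Transformer classifier are omitted.

```python
import math
import random

OMEGA_0 = 30.0  # sine frequency scale commonly used with SIREN (assumed here)
HIDDEN = 4      # hypothetical hidden width, kept tiny for readability


def siren(theta, x):
    """y = sum_j w2[j] * sin(OMEGA_0 * (w1[j] * x + b1[j])) + b2[0]."""
    w1, b1, w2, b2 = theta
    return sum(v * math.sin(OMEGA_0 * (w * x + b))
               for w, b, v in zip(w1, b1, w2)) + b2[0]


def mse(theta, xs, ys):
    """Mean squared reconstruction error over the sampled coordinates."""
    return sum((siren(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_grad(theta, xs, ys):
    """Analytic gradient of the MSE with respect to every parameter group."""
    w1, b1, w2, b2 = theta
    g_w1, g_b1, g_w2 = [0.0] * len(w1), [0.0] * len(w1), [0.0] * len(w1)
    g_b2 = [0.0]
    for x, y in zip(xs, ys):
        err = 2.0 * (siren(theta, x) - y) / len(xs)
        for j in range(len(w1)):
            pre = OMEGA_0 * (w1[j] * x + b1[j])
            d_pre = err * w2[j] * math.cos(pre) * OMEGA_0
            g_w1[j] += d_pre * x
            g_b1[j] += d_pre
            g_w2[j] += err * math.sin(pre)
        g_b2[0] += err
    return [g_w1, g_b1, g_w2, g_b2]


def fit_inr(theta0, alpha, xs, ys, steps=3):
    """Inner loop: a few gradient steps from the shared init theta0,
    each parameter scaled by its own learning rate in alpha."""
    theta = [list(group) for group in theta0]  # copy, so theta0 stays shared
    for _ in range(steps):
        grads = mse_grad(theta, xs, ys)
        theta = [[p - a * g for p, a, g in zip(ps, al, gs)]
                 for ps, al, gs in zip(theta, alpha, grads)]
    return theta


def flatten(theta):
    """Flatten the fitted weights into one vector for a downstream classifier."""
    return [p for group in theta for p in group]


# Placeholder values: in the paper's setting these would be meta-learned.
random.seed(0)
theta0 = [[random.uniform(-1, 1) for _ in range(HIDDEN)],      # w1
          [random.uniform(-1, 1) for _ in range(HIDDEN)],      # b1
          [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)],  # w2
          [0.0]]                                               # b2
alpha = [[1e-4] * HIDDEN, [1e-4] * HIDDEN, [1e-2] * HIDDEN, [1e-2]]

xs = [i / 16 - 1 for i in range(33)]   # coordinates in [-1, 1]
ys = [math.sin(3 * x) for x in xs]     # toy 1D signal to encode

before = mse(theta0, xs, ys)
fitted = fit_inr(theta0, alpha, xs, ys)
after = mse(fitted, xs, ys)
features = flatten(fitted)             # input a classifier would consume
```

Because every signal is fitted from the same `theta0` with the same `alpha`, the resulting weight vectors live in a shared coordinate system, which is one plausible reading of why a plain Transformer can classify them without explicit symmetry handling. In an end-to-end setup, `theta0` and `alpha` would be trained jointly with the classifier rather than fixed by hand as they are here.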

