CV · GA · Aug 29, 2025

GraViT: Transfer Learning with Vision Transformers and MLP-Mixer for Strong Gravitational Lens Discovery

arXiv:2509.00226v1 · 1 citation · h-index: 13 · Mon Not R Astron Soc
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for automated classifiers that can handle the large number of gravitational lenses expected from LSST. Its contribution is incremental, since it builds on existing methods and datasets.

The authors address automated gravitational lens detection for the upcoming LSST survey. They introduce GraViT, a pipeline built on pretrained Vision Transformers and MLP-Mixer, and benchmark it against convolutional baselines using datasets from HOLISMOKES VI and SuGOHI X.

Gravitational lensing offers a powerful probe into the properties of dark matter and is crucial to infer cosmological parameters. The Legacy Survey of Space and Time (LSST) is predicted to find O(10^5) gravitational lenses over the next decade, demanding automated classifiers. In this work, we introduce GraViT, a PyTorch pipeline for gravitational lens detection that leverages extensive pretraining of state-of-the-art Vision Transformer (ViT) models and MLP-Mixer. We assess the impact of transfer learning on classification performance by examining data quality (source and sample size), model architecture (selection and fine-tuning), training strategies (augmentation, normalization, and optimization), and ensemble predictions. This study reproduces the experiments in a previous systematic comparison of neural networks and provides insights into the detectability of strong gravitational lenses on that common test sample. We fine-tune ten architectures using datasets from HOLISMOKES VI and SuGOHI X, and benchmark them against convolutional baselines, discussing complexity and inference-time analysis.
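The abstract lists ensemble predictions among the factors studied but does not say how the individual model outputs are combined. As a minimal sketch, assuming a simple unweighted mean of per-model lens probabilities followed by a fixed threshold, the idea could look like the following. The function names, the threshold, and the scores are all hypothetical and are not taken from GraViT.

```python
from statistics import fmean


def ensemble_scores(per_model_scores):
    """Average lens probabilities across models (one list of scores per model).

    Assumption: an unweighted mean. The paper may combine models differently.
    """
    n_images = len(per_model_scores[0])
    if any(len(scores) != n_images for scores in per_model_scores):
        raise ValueError("every model must score the same set of images")
    # zip(*...) regroups the scores by image, so each tuple holds
    # the scores that all models assigned to one image.
    return [fmean(image_scores) for image_scores in zip(*per_model_scores)]


def classify(scores, threshold=0.5):
    """Flag an image as a lens candidate when its score reaches the threshold."""
    return [score >= threshold for score in scores]


# Hypothetical scores from three fine-tuned models on four image cutouts.
vit_scores = [0.92, 0.10, 0.65, 0.30]
mixer_scores = [0.88, 0.20, 0.35, 0.40]
cnn_scores = [0.75, 0.05, 0.60, 0.20]

combined = ensemble_scores([vit_scores, mixer_scores, cnn_scores])
labels = classify(combined)
print(labels)
```

In this made-up example the third cutout is rejected by the MLP-Mixer score alone (0.35) but accepted by the ensemble, which illustrates why averaging can change the candidate list relative to any single model.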

