LG · IM · AI · Nov 14, 2025

Intrinsic Dimension Estimation for Radio Galaxy Zoo using Diffusion Models

arXiv:2511.11490v1 · h-index: 5
Originality: Synthesis-oriented
AI Analysis

This work provides insights into the complexity of radio galaxy data for astronomers, but it is incremental as it applies an existing method to a new dataset.

The study estimates the intrinsic dimension of the Radio Galaxy Zoo dataset using a diffusion model. It finds that out-of-distribution sources have higher intrinsic dimension values and that the overall intrinsic dimension exceeds those typically reported for natural image datasets. It also reports a weak trend toward higher signal-to-noise ratio at lower intrinsic dimension.

In this work, we estimate the intrinsic dimension (iD) of the Radio Galaxy Zoo (RGZ) dataset using a score-based diffusion model. We examine how the iD estimates vary as a function of Bayesian neural network (BNN) energy scores, which measure how similar the radio sources are to the MiraBest subset of the RGZ dataset. We find that out-of-distribution sources exhibit higher iD values, and that the overall iD for RGZ exceeds those typically reported for natural image datasets. Furthermore, we analyse how iD varies across Fanaroff-Riley (FR) morphological classes and as a function of the signal-to-noise ratio (SNR). While no relationship is found between FR I and FR II classes, a weak trend toward higher SNR at lower iD is observed. Future work using the RGZ dataset could make use of the relationship between iD and energy scores to quantitatively study and improve the representations learned by various self-supervised learning algorithms.
