IM, HE, SR, LG | Oct 14, 2025

Simulation-Based Pretraining and Domain Adaptation for Astronomical Time Series with Minimal Labeled Data

arXiv:2510.12958v1 | h-index: 19
Originality: Incremental advance
AI Analysis

The paper offers a practical way to build general models in astronomy when labeled data is scarce, but the advance is incremental: it adapts existing pretraining and domain adaptation techniques to this domain.

The paper addresses the scarcity of labeled observational data in astronomical time-series analysis with a simulation-based pretraining approach that reduces the need for labeled real data. With minimal fine-tuning, the resulting models show substantial improvements in classification, redshift estimation, and anomaly detection, along with effective zero-shot transfer across telescopes and generalization to different astronomical phenomena.

Astronomical time-series analysis faces a critical limitation: the scarcity of labeled observational data. We present a pre-training approach that leverages simulations, significantly reducing the need for labeled examples from real observations. Our models, trained on simulated data from multiple astronomical surveys (ZTF and LSST), learn generalizable representations that transfer effectively to downstream tasks. Using classifier-based architectures enhanced with contrastive and adversarial objectives, we create domain-agnostic models that demonstrate substantial performance improvements over baseline methods in classification, redshift estimation, and anomaly detection when fine-tuned with minimal real data. Remarkably, our models exhibit effective zero-shot transfer capabilities, achieving comparable performance on future telescope (LSST) simulations when trained solely on existing telescope (ZTF) data. Furthermore, they generalize to very different astronomical phenomena (namely variable stars from NASA's Kepler telescope) despite being trained on transient events, demonstrating cross-domain capabilities. Our approach provides a practical solution for building general models when labeled data is scarce but domain knowledge can be encoded in simulations.

