Ata Akbari Asanjan

LG
h-index36
7papers
167citations
Novelty46%
AI Score41

7 Papers

LGMar 22, 2023
Anomaly Detection in Aeronautics Data with Quantum-compatible Discrete Deep Generative Model

Thomas Templin, Milad Memarzadeh, Walter Vinci et al.

Deep generative learning cannot only be used for generating new data with statistical characteristics derived from input data but also for anomaly detection, by separating nominal and anomalous instances based on their reconstruction quality. In this paper, we explore the performance of three unsupervised deep generative models -- variational autoencoders (VAEs) with Gaussian, Bernoulli, and Boltzmann priors -- in detecting anomalies in flight-operations data of commercial flights consisting of multivariate time series. We devised two VAE models with discrete latent variables (DVAEs), one with a factorized Bernoulli prior and one with a restricted Boltzmann machine (RBM) as prior, because of the demand for discrete-variable models in machine-learning applications and because the integration of quantum devices based on two-level quantum systems requires such models. The DVAE with RBM prior, using a relatively simple -- and classically or quantum-mechanically enhanceable -- sampling technique for the evolution of the RBM's negative phase, performed better than the Bernoulli DVAE and on par with the Gaussian model, which has a continuous latent space. Our studies demonstrate the competitiveness of a discrete deep generative model with its Gaussian counterpart on anomaly-detection tasks. Moreover, the DVAE model with RBM prior can be easily integrated with quantum sampling by outsourcing its generative process to measurements of quantum states obtained from a quantum annealer or gate-model device.

LGJan 1
Sequential Reservoir Computing for Efficient High-Dimensional Spatiotemporal Forecasting

Ata Akbari Asanjan, Filip Wudarski, Daniel O'Connor et al.

Forecasting high-dimensional spatiotemporal systems remains computationally challenging for recurrent neural networks (RNNs) and long short-term memory (LSTM) models due to gradient-based training and memory bottlenecks. Reservoir Computing (RC) mitigates these challenges by replacing backpropagation with fixed recurrent layers and a convex readout optimization, yet conventional RC architectures still scale poorly with input dimensionality. We introduce a Sequential Reservoir Computing (Sequential RC) architecture that decomposes a large reservoir into a series of smaller, interconnected reservoirs. This design reduces memory and computational costs while preserving long-term temporal dependencies. Using both low-dimensional chaotic systems (Lorenz63) and high-dimensional physical simulations (2D vorticity and shallow-water equations), Sequential RC achieves 15-25% longer valid forecast horizons, 20-30% lower error metrics (SSIM, RMSE), and up to three orders of magnitude lower training cost compared to LSTM and standard RNN baselines. The results demonstrate that Sequential RC maintains the simplicity and efficiency of conventional RC while achieving superior scalability for high-dimensional dynamical systems. This approach provides a practical path toward real-time, energy-efficient forecasting in scientific and engineering applications.

LGMay 15, 2025Code
GAIA: A Foundation Model for Operational Atmospheric Dynamics

Ata Akbari Asanjan, Olivia Alexander, Tom Berg et al.

We introduce GAIA (Geospatial Artificial Intelligence for Atmospheres), a hybrid self-supervised geospatial foundation model that fuses Masked Autoencoders (MAE) with self-distillation with no labels (DINO) to generate semantically rich representations from global geostationary satellite imagery. Pre-trained on 15 years of globally-merged infrared observations (2001-2015), GAIA learns disentangled representations that capture atmospheric dynamics rather than trivial diurnal patterns, as evidenced by distributed principal component structure and temporal coherence analysis. We demonstrate robust reconstruction capabilities across varying data availability (30-95% masking), achieving superior gap-filling performance on real missing data patterns. When transferred to downstream tasks, GAIA consistently outperforms an MAE-only baseline: improving atmospheric river segmentation (F1: 0.58 vs 0.52), enhancing tropical cyclone detection (storm-level recall: 81% vs 75%, early detection: 29% vs 17%), and maintaining competitive precipitation estimation performance. Analysis reveals that GAIA's hybrid objectives encourage learning of spatially coherent, object-centric features distributed across multiple principal components rather than concentrated representations focused on reconstruction. This work demonstrates that combining complementary self-supervised objectives yields more transferable representations for diverse atmospheric modeling tasks. Model weights and code are available at: https://huggingface.co/bcg-usra-nasa-gaia/GAIA-v1.

CVDec 3, 2024
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications

Daniela Szwarcman, Sujit Roy, Paolo Fraccaro et al.

This technical report presents Prithvi-EO-2.0, a new geospatial foundation model that offers significant improvements over its predecessor, Prithvi-EO-1.0. Trained on 4.2M global time series samples from NASA's Harmonized Landsat and Sentinel-2 data archive at 30m resolution, the new 300M and 600M parameter models incorporate temporal and location embeddings for enhanced performance across various geospatial tasks. Through extensive benchmarking with GEO-Bench, the 600M version outperforms the previous Prithvi-EO model by 8\% across a range of tasks. It also outperforms six other geospatial foundation models when benchmarked on remote sensing tasks from different domains and resolutions (i.e. from 0.1m to 15m). The results demonstrate the versatility of the model in both classical earth observation and high-resolution applications. Early involvement of end-users and subject matter experts (SMEs) are among the key factors that contributed to the project's success. In particular, SME involvement allowed for constant feedback on model and dataset design, as well as successful customization for diverse SME-led applications in disaster response, land use and crop mapping, and ecosystem dynamics monitoring. Prithvi-EO-2.0 is available on Hugging Face and IBM terratorch, with additional resources on GitHub. The project exemplifies the Trusted Open Science approach embraced by all involved organizations.

SRAug 18, 2025
Surya: Foundation Model for Heliophysics

Sujit Roy, Johannes Schmude, Rohit Lal et al.

Heliophysics is central to understanding and forecasting space weather events and solar activity. Despite decades of high-resolution observations from the Solar Dynamics Observatory (SDO), most models remain task-specific and constrained by scarce labeled data, limiting their capacity to generalize across solar phenomena. We introduce Surya, a 366M parameter foundation model for heliophysics designed to learn general-purpose solar representations from multi-instrument SDO observations, including eight Atmospheric Imaging Assembly (AIA) channels and five Helioseismic and Magnetic Imager (HMI) products. Surya employs a spatiotemporal transformer architecture with spectral gating and long--short range attention, pretrained on high-resolution solar image forecasting tasks and further optimized through autoregressive rollout tuning. Zero-shot evaluations demonstrate its ability to forecast solar dynamics and flare events, while downstream fine-tuning with parameter-efficient Low-Rank Adaptation (LoRA) shows strong performance on solar wind forecasting, active region segmentation, solar flare forecasting, and EUV spectra. Surya is the first foundation model in heliophysics that uses time advancement as a pretext task on full-resolution SDO data. Its novel architecture and performance suggest that the model is able to learn the underlying physics behind solar evolution.

AIJun 19, 2024
Combinatorial Reasoning: Selecting Reasons in Generative AI Pipelines via Combinatorial Optimization

Mert Esencan, Tarun Advaith Kumar, Ata Akbari Asanjan et al.

Recent Large Language Models (LLMs) have demonstrated impressive capabilities at tasks that require human intelligence and are a significant step towards human-like artificial intelligence (AI). Yet the performance of LLMs at reasoning tasks have been subpar and the reasoning capability of LLMs is a matter of significant debate. While it has been shown that the choice of the prompting technique to the LLM can alter its performance on a multitude of tasks, including reasoning, the best performing techniques require human-made prompts with the knowledge of the tasks at hand. We introduce a framework for what we call Combinatorial Reasoning (CR), a fully-automated prompting method, where reasons are sampled from an LLM pipeline and mapped into a Quadratic Unconstrained Binary Optimization (QUBO) problem. The framework investigates whether QUBO solutions can be profitably used to select a useful subset of the reasons to construct a Chain-of-Thought style prompt. We explore the acceleration of CR with specialized solvers. We also investigate the performance of simpler zero-shot strategies such as linear majority rule or random selection of reasons. Our preliminary study indicates that coupling a combinatorial solver to generative AI pipelines is an interesting avenue for AI reasoning and elucidates design principles for future CR methods.

CVFeb 17, 2022
Deep Transfer Learning on Satellite Imagery Improves Air Quality Estimates in Developing Nations

Nishant Yadav, Meytar Sorek-Hamer, Michael Von Pohle et al.

Urban air pollution is a public health challenge in low- and middle-income countries (LMICs). However, LMICs lack adequate air quality (AQ) monitoring infrastructure. A persistent challenge has been our inability to estimate AQ accurately in LMIC cities, which hinders emergency preparedness and risk mitigation. Deep learning-based models that map satellite imagery to AQ can be built for high-income countries (HICs) with adequate ground data. Here we demonstrate that a scalable approach that adapts deep transfer learning on satellite imagery for AQ can extract meaningful estimates and insights in LMIC cities based on spatiotemporal patterns learned in HIC cities. The approach is demonstrated for Accra in Ghana, Africa, with AQ patterns learned from two US cities, specifically Los Angeles and New York.