Anna K. Liljedahl

CV
h-index24
3papers
7citations
Novelty57%
AI Score38

3 Papers

51.8CVMay 28
Clustering Guided Domain-Specific Pretrained Foundation Model Very High-Resolution Arctic Remote Sensing

Amal S. Perera, Chandi Witharana, Elias Manos et al.

This study introduces a novel Arctic-focused remote sensing foundation model (RSFM) by combining diversity-aware regional-scale image curation with masked autoencoder (MAE) self-supervised pretraining of a Vision Transformer (ViT) encoder for very-high-spatial-resolution (VHSR) satellite image analysis. Spectral and acquisition-metadata descriptors were used in a scalable affinity-propagation clustering workflow to select approximately 3 million chips from 267 TB of Vantor VHSR imagery This curation strategy was designed to reduce oversampling of visually repetitive or low-information areas while preserving broad scene diversity across the study domain. We pretrained a ViT-Large encoder on the curated corpus using a domain-adapted MAE reconstruction objective, producing Arctic-specific transformer weights for downstream feature mapping. The pretrained encoder was integrated into an existing location-aware detection and segmentation framework and evaluated across four hand-labeled Arctic datasets. Compared to ImageNet-initialized ViT-Large baseline, Arctic MAE pretraining produced consistent improvements in foreground mean F1 scores of 0.87, 0.72, 0.93, and 0.87, for infrastructure, IWP, RTS, and TCNs, with approximately 5-8 percentage increase. The proposed model also outperformed Prithvi-EO-2.0 in all downstream comparisons, with the smallest gain corresponding to at least a 15 percentage improvement mean F1, suggesting that domain-specific self-supervised pretraining on curated Arctic VHSR imagery provides more transferable representations for fine-scale Arctic mapping than a general-purpose Earth observation foundation model. These results demonstrate that optimizing the pretraining data distribution at regional scale, while keeping the architecture and MAE objective fixed, can produce a reusable Arctic-domain encoder for multiple VHSR remote sensing applications.

IRNov 19, 2024
Advancing Large Language Models for Spatiotemporal and Semantic Association Mining of Similar Environmental Events

Yuanyuan Tian, Wenwen Li, Lei Hu et al.

Retrieval and recommendation are two essential tasks in modern search tools. This paper introduces a novel retrieval-reranking framework leveraging Large Language Models (LLMs) to enhance the spatiotemporal and semantic associated mining and recommendation of relevant unusual climate and environmental events described in news articles and web posts. This framework uses advanced natural language processing techniques to address the limitations of traditional manual curation methods in terms of high labor cost and lack of scalability. Specifically, we explore an optimized solution to employ cutting-edge embedding models for semantically analyzing spatiotemporal events (news) and propose a Geo-Time Re-ranking (GT-R) strategy that integrates multi-faceted criteria including spatial proximity, temporal association, semantic similarity, and category-instructed similarity to rank and identify similar spatiotemporal events. We apply the proposed framework to a dataset of four thousand Local Environmental Observer (LEO) Network events, achieving top performance in recommending similar events among multiple cutting-edge dense retrieval models. The search and recommendation pipeline can be applied to a wide range of similar data search tasks dealing with geospatial and temporal data. We hope that by linking relevant events, we can better aid the general public to gain an enhanced understanding of climate change and its impact on different communities.

CVJun 3, 2025
Pan-Arctic Permafrost Landform and Human-built Infrastructure Feature Detection with Vision Transformers and Location Embeddings

Amal S. Perera, David Fernandez, Chandi Witharana et al.

Accurate mapping of permafrost landforms, thaw disturbances, and human-built infrastructure at pan-Arctic scale using sub-meter satellite imagery is increasingly critical. Handling petabyte-scale image data requires high-performance computing and robust feature detection models. While convolutional neural network (CNN)-based deep learning approaches are widely used for remote sensing (RS),similar to the success in transformer based large language models, Vision Transformers (ViTs) offer advantages in capturing long-range dependencies and global context via attention mechanisms. ViTs support pretraining via self-supervised learning-addressing the common limitation of labeled data in Arctic feature detection and outperform CNNs on benchmark datasets. Arctic also poses challenges for model generalization, especially when features with the same semantic class exhibit diverse spectral characteristics. To address these issues for Arctic feature detection, we integrate geospatial location embeddings into ViTs to improve adaptation across regions. This work investigates: (1) the suitability of pre-trained ViTs as feature extractors for high-resolution Arctic remote sensing tasks, and (2) the benefit of combining image and location embeddings. Using previously published datasets for Arctic feature detection, we evaluate our models on three tasks-detecting ice-wedge polygons (IWP), retrogressive thaw slumps (RTS), and human-built infrastructure. We empirically explore multiple configurations to fuse image embeddings and location embeddings. Results show that ViTs with location embeddings outperform prior CNN-based models on two of the three tasks including F1 score increase from 0.84 to 0.92 for RTS detection, demonstrating the potential of transformer-based models with spatial awareness for Arctic RS applications.