CVSep 6, 2022
Progressive Domain Adaptation with Contrastive Learning for Object Detection in the Satellite ImageryDebojyoti Biswas, Jelena Tešić
State-of-the-art object detection methods applied to satellite and drone imagery largely fail to identify small and dense objects. One reason is the high variability of content in the overhead imagery due to the terrestrial region captured and the high variability of acquisition conditions. Another reason is that the number and size of objects in aerial imagery are very different than in the consumer data. In this work, we propose a small object detection pipeline that improves the feature extraction process by spatial pyramid pooling, cross-stage partial networks, heatmap-based region proposal network, and object localization and identification through a novel image difficulty score that adapts the overall focal loss measure based on the image difficulty. Next, we propose novel contrastive learning with progressive domain adaptation to produce domain-invariant features across aerial datasets using local and global components. We show we can alleviate the degradation of object identification in previously unseen datasets. We create a first-ever domain adaptation benchmark using contrastive learning for the object detection task in highly imbalanced satellite datasets with significant domain gaps and dominant small objects. The proposed method results in a 7.4% increase in mAP performance measure over the best state-of-art.
CVFeb 18, 2025
Frequency-Aware Vision Transformers for High-Fidelity Super-Resolution of Earth System ModelsEhsan Zeraatkar, Salah A Faroughi, Jelena Tešić
Super-resolution (SR) is crucial for enhancing the spatial fidelity of Earth System Model (ESM) outputs, allowing fine-scale structures vital to climate science to be recovered from coarse simulations. However, traditional deep super-resolution methods, including convolutional and transformer-based models, tend to exhibit spectral bias, reconstructing low-frequency content more readily than valuable high-frequency details. In this work, we introduce two frequency-aware frameworks: the Vision Transformer-Tuned Sinusoidal Implicit Representation (ViSIR), combining Vision Transformers and sinusoidal activations to mitigate spectral bias, and the Vision Transformer Fourier Representation Network (ViFOR), which integrates explicit Fourier-based filtering for independent low- and high-frequency learning. Evaluated on the E3SM-HR Earth system dataset across surface temperature, shortwave, and longwave fluxes, these models outperform leading CNN, GAN, and vanilla transformer baselines, with ViFOR demonstrating up to 2.6~dB improvements in PSNR and significantly higher SSIM. Detailed ablation and scaling studies highlight the benefit of full-field training, the impact of frequency hyperparameters, and the potential for generalization. The results establish ViFOR as a state-of-the-art, scalable solution for climate data downscaling. Future extensions will address temporal super-resolution, multimodal climate variables, automated parameter selection, and integration of physical conservation constraints to broaden scientific applicability.
CVFeb 10, 2025
ViSIR: Vision Transformer Single Image Reconstruction Method for Earth System ModelsEhsan Zeraatkar, Salah Faroughi, Jelena Tešić
Purpose: Earth system models (ESMs) integrate the interactions of the atmosphere, ocean, land, ice, and biosphere to estimate the state of regional and global climate under a wide variety of conditions. The ESMs are highly complex; thus, deep neural network architectures are used to model the complexity and store the down-sampled data. This paper proposes the Vision Transformer Sinusoidal Representation Networks (ViSIR) to improve the ESM data's single image SR (SR) reconstruction task. Methods: ViSIR combines the SR capability of Vision Transformers (ViT) with the high-frequency detail preservation of the Sinusoidal Representation Network (SIREN) to address the spectral bias observed in SR tasks. Results: The ViSIR outperforms SRCNN by 2.16 db, ViT by 6.29 dB, SIREN by 8.34 dB, and SR-Generative Adversarial (SRGANs) by 7.93 dB PSNR on average for three different measurements. Conclusion: The proposed ViSIR is evaluated and compared with state-of-the-art methods. The results show that the proposed algorithm is outperforming other methods in terms of Mean Square Error(MSE), Peak-Signal-to-Noise-Ratio(PSNR), and Structural Similarity Index Measure(SSIM).
LGJan 1, 2025
KAN KAN Buff Signed Graph Neural Networks?Muhieddine Shebaro, Jelena Tešić
Graph Representation Learning aims to create effective embeddings for nodes and edges that encapsulate their features and relationships. Graph Neural Networks (GNNs) leverage neural networks to model complex graph structures. Recently, the Kolmogorov-Arnold Neural Network (KAN) has emerged as a promising alternative to the traditional Multilayer Perceptron (MLP), offering improved accuracy and interpretability with fewer parameters. In this paper, we propose the integration of KANs into Signed Graph Convolutional Networks (SGCNs), leading to the development of KAN-enhanced SGCNs (KASGCN). We evaluate KASGCN on tasks such as signed community detection and link sign prediction to improve embedding quality in signed networks. Our experimental results indicate that KASGCN exhibits competitive or comparable performance to standard SGCNs across the tasks evaluated, with performance variability depending on the specific characteristics of the signed graph and the choice of parameter settings. These findings suggest that KASGCNs hold promise for enhancing signed graph analysis with context-dependent effectiveness.
SISep 16, 2020
Characterizing Attitudinal Network Graphs through Frustration CloudLucas Rusnak, Jelena Tešić
Attitudinal Network Graphs are signed graphs where edges capture an expressed opinion; two vertices connected by an edge can be agreeable (positive) or antagonistic (negative). A signed graph is called balanced if each of its cycles includes an even number of negative edges. Balance is often characterized by the frustration index or by finding a single convergent balanced state of network consensus. In this paper, we propose to expand the measures of consensus from a single balanced state associated with the frustration index to the set of nearest balanced states. We introduce the frustration cloud as a set of all nearest balanced states and use a graph-balancing algorithm to find all nearest balanced states in a deterministic way. Computational concerns are addressed by measuring consensus probabilistically, and we introduce new vertex and edge metrics to quantify status, agreement, and influence. We also introduce a new global measure of controversy for a given signed graph and show that vertex status is a zero-sum game in the signed network. We propose an efficient scalable algorithm for calculating frustration cloud-based measures in social network and survey data of up to 80,000 vertices and half-a-million edges. We also demonstrate the power of the proposed approach to provide discriminant features for community discovery when compared to spectral clustering and to automatically identify dominant vertices and anomalous decisions in the network.