Zhongying Wang

h-index5
2papers

2 Papers

10.2CVApr 20
A Proxy Consistency Loss for Grounded Fusion of Earth Observation and Location Encoders

Zhongying Wang, Kevin Lane, Levi Cai et al.

Supervised learning with Earth observation inputs is often limited by the sparsity of high-quality labeled or in-situ measured data to use as training labels. With the abundance of geographic data products, in many cases there are variables correlated with - but different from - the variable of interest that can be leveraged. We integrate such proxy variables within a geographic prior via a trainable location encoder and introduce a proxy consistency loss (PCL) formulation to imbue proxy data into the location encoder. The first key insight behind our approach is to use the location encoder as an agile and flexible way to learn from abundantly available proxy data which can be sampled independently of training label availability. Our second key insight is that we will need to regularize the location encoder appropriately to achieve performance and robustness with limited labeled data. Our experiments on air quality prediction and poverty mapping show that integrating proxy data implicitly through the location encoder outperforms using both as input to an observation encoder and fusion strategies that use frozen, pretrained location embeddings as a geographic prior. Superior performance for in-sample prediction shows that the PCL can incorporate rich information from the proxies, and superior out-of-sample prediction shows that the learned latent embeddings help generalize to areas without training labels.

LGMay 24, 2025
Performance and Generalizability Impacts of Incorporating Location Encoders into Deep Learning for Dynamic PM2.5 Estimation

Morteza Karimzadeh, Zhongying Wang, James L. Crooks

Deep learning has shown strong performance in geospatial prediction tasks, but the role of geolocation information in improving accuracy and generalizability remains underexamined. Recent work has introduced location encoders that aim to represent spatial context in a transferable way, yet most evaluations have focused on static mapping tasks. Here, we study the effect of incorporating geolocation into deep learning for a dynamic and spatially heterogeneous application: estimating daily surface-level PM2.5 across the contiguous United States using satellite and ground-based observations. We compare three strategies for representing location: excluding geolocation, using raw latitude and longitude, and using pretrained location encoders. We evaluate each under within-region and out-of-region generalization settings. Results show that raw coordinates can improve performance within regions by supporting spatial interpolation, but can reduce generalizability across regions. In contrast, pretrained location encoders such as GeoCLIP improve both predictive accuracy and geographic transfer. However, we also observe spatial artifacts linked to encoder characteristics, and performance varies across encoder types (e.g., SatCLIP vs. GeoCLIP). This work provides the first systematic evaluation of location encoders in a dynamic environmental estimation context and offers guidance for incorporating geolocation into deep learning models for geospatial prediction.