Exploring the Potential of Latent Embeddings for Sea Ice Characterization using ICESat-2 Data
This work addresses the challenge of label scarcity for researchers in remote sensing and climate science, but it is incremental as it applies existing unsupervised methods to a specific domain.
This study tackles the heavy reliance on manually collected labels for sea ice characterization with ICESat-2 data by using unsupervised autoencoders to derive latent embeddings. The resulting embeddings preserve the overall structure of the data and form more compact clusters, which could reduce the number of labeled samples required.
The Ice, Cloud, and land Elevation Satellite-2 (ICESat-2) provides high-resolution measurements of sea ice height. Recent studies have developed machine learning methods for ICESat-2 data, primarily focusing on surface type classification. However, supervised learning relies heavily on manually collected labels, which take significant time and effort to produce because each track measurement must be cross-referenced with overlapping background optical imagery. Additionally, ICESat-2 tracks rarely coincide with background images because of differing overpass patterns and atmospheric conditions. To address these limitations, this study explores the potential of unsupervised autoencoders trained on unlabeled data to derive latent embeddings. We develop autoencoder models based on Long Short-Term Memory (LSTM) networks and Convolutional Neural Networks (CNNs) to reconstruct topographic sequences from ICESat-2 and derive embeddings. We then apply Uniform Manifold Approximation and Projection (UMAP) to reduce the dimensionality of the embeddings and visualize them. Our results show that the autoencoder embeddings preserve the overall structure of the original ICESat-2 data while forming relatively more compact clusters, indicating that embeddings could reduce the number of labeled samples required.
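The reconstruct-then-embed idea described above can be illustrated with a minimal sketch. It is not the authors' model: a small dense linear autoencoder stands in for the LSTM and CNN autoencoders, the height profile is synthetic rather than ICESat-2 data, and the names `make_windows` and `LinearAutoencoder`, the window length (16), the stride (4), and the latent size (4) are all assumptions chosen only for illustration.

```python
import math
import random


def make_windows(heights, length, stride):
    """Cut a 1-D height profile into fixed-length, mean-centered windows."""
    windows = []
    for start in range(0, len(heights) - length + 1, stride):
        segment = heights[start:start + length]
        mean = sum(segment) / length
        windows.append([h - mean for h in segment])
    return windows


class LinearAutoencoder:
    """Dense linear encoder/decoder (hypothetical stand-in for LSTM/CNN models)."""

    def __init__(self, n_in, n_latent, seed=0):
        rng = random.Random(seed)
        scale = 1.0 / math.sqrt(n_in)
        self.enc = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                    for _ in range(n_latent)]
        self.dec = [[rng.uniform(-scale, scale) for _ in range(n_latent)]
                    for _ in range(n_in)]

    def encode(self, x):
        # Latent embedding z = enc @ x
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.enc]

    def decode(self, z):
        # Reconstruction x_hat = dec @ z
        return [sum(w * zi for w, zi in zip(row, z)) for row in self.dec]

    def loss(self, data):
        # Mean squared reconstruction error over all windows
        total = 0.0
        for x in data:
            x_hat = self.decode(self.encode(x))
            total += sum((a - b) ** 2 for a, b in zip(x_hat, x)) / len(x)
        return total / len(data)

    def train_epoch(self, data, lr):
        # Plain per-sample gradient descent on 0.5 * ||x_hat - x||^2
        for x in data:
            z = self.encode(x)
            x_hat = self.decode(z)
            err = [a - b for a, b in zip(x_hat, x)]
            # Gradient w.r.t. z, computed before the decoder is updated
            grad_z = [sum(self.dec[i][k] * err[i] for i in range(len(x)))
                      for k in range(len(z))]
            for i, e in enumerate(err):
                for k, zk in enumerate(z):
                    self.dec[i][k] -= lr * e * zk
            for k, g in enumerate(grad_z):
                for j, xj in enumerate(x):
                    self.enc[k][j] -= lr * g * xj


# Synthetic "height profile": a smooth undulation plus noise (not real data).
rng = random.Random(42)
profile = [0.3 * math.sin(i / 5.0) + rng.gauss(0.0, 0.1) for i in range(400)]

windows = make_windows(profile, length=16, stride=4)
model = LinearAutoencoder(n_in=16, n_latent=4)

loss_before = model.loss(windows)
for _ in range(30):
    model.train_epoch(windows, lr=0.05)
loss_after = model.loss(windows)

# One low-dimensional embedding per window, learned without any labels.
embeddings = [model.encode(w) for w in windows]
print(f"reconstruction MSE: {loss_before:.4f} -> {loss_after:.4f}")
```

In the workflow the abstract describes, embeddings like these would next be passed to UMAP for dimensionality reduction and visualization. A linear model is used here only to keep the sketch dependency-free; it cannot capture the sequential or local structure that LSTM and CNN encoders are designed for.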