CV · Jul 8, 2024

Tile Compression and Embeddings for Multi-Label Classification in GeoLifeCLEF 2024

arXiv:2407.06326v1 · 1 citation · h-index: 4 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses species prediction for ecological monitoring, but it is incremental as it applies existing methods like DCT and LSH to a competition dataset.

The paper tackled the multi-label classification task for predicting plant species presence using remote sensing data in the GeoLifeCLEF 2024 competition, achieving a leaderboard score of 0.152 and a post-competition score of 0.161.

As the DS@GT team, we explore methods for the multi-label classification task posed by the GeoLifeCLEF 2024 competition, which aims to predict the presence and absence of plant species at specific locations using spatial and temporal remote sensing data. Our approach uses frequency-domain coefficients from the Discrete Cosine Transform (DCT) to compress and pre-compute the raw input data for convolutional neural networks. We also investigate nearest-neighbor models via locality-sensitive hashing (LSH), both for prediction and to aid the self-supervised contrastive learning of embeddings through tile2vec. Our best competition model used geolocation features and reached a leaderboard score of 0.152; our best post-competition score was 0.161. Source code and models are available at https://github.com/dsgt-kaggle-clef/geolifeclef-2024.
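To make the DCT compression idea concrete, here is a minimal sketch of how a single-band tile might be reduced to its low-frequency coefficients. It is not the authors' implementation: the function names (`dct_matrix`, `compress_tile`, `reconstruct_tile`), the square `keep` x `keep` truncation, and the tile sizes are all assumptions made for illustration, and the transform is written out with NumPy rather than taken from the repository.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # the DC row uses a different scale factor
    return mat


def compress_tile(tile: np.ndarray, keep: int) -> np.ndarray:
    """Apply a 2D DCT and keep only the top-left keep x keep coefficients.

    Hypothetical helper: `keep` is an illustrative parameter, not a value
    taken from the paper.
    """
    h, w = tile.shape
    coeffs = dct_matrix(h) @ tile @ dct_matrix(w).T
    return coeffs[:keep, :keep]


def reconstruct_tile(coeffs: np.ndarray, shape: tuple) -> np.ndarray:
    """Zero-pad the kept coefficients and invert the 2D DCT."""
    h, w = shape
    padded = np.zeros(shape)
    kh, kw = coeffs.shape
    padded[:kh, :kw] = coeffs
    # The basis is orthonormal, so the inverse is the transpose.
    return dct_matrix(h).T @ padded @ dct_matrix(w)


if __name__ == "__main__":
    # A smooth synthetic 32 x 32 tile standing in for one raster band.
    y, x = np.mgrid[0:32, 0:32]
    tile = np.sin(x / 10.0) + np.cos(y / 14.0)

    small = compress_tile(tile, keep=8)  # 64 values instead of 1024
    approx = reconstruct_tile(small, tile.shape)
    print("stored coefficients:", small.size)
    print("max abs reconstruction error:", np.abs(tile - approx).max())
```

Smooth imagery concentrates most of its energy in the low-frequency corner of the coefficient grid, so a small block of coefficients can stand in for the full tile when pre-computing network inputs. How many coefficients to keep is a trade-off that this sketch does not attempt to tune.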

Code Implementations: 1 repo
