LD-SDM: Language-Driven Hierarchical Species Distribution Modeling
This work addresses the problem of mapping species ranges for ecologists and conservationists. Its integration of taxonomic information is novel, but its core modeling approach remains incremental.
The paper tackles species distribution modeling by integrating taxonomic classification through a language model, enabling range prediction for any taxonomic rank including unseen species without additional supervision. The proposed model outperforms state-of-the-art methods in species range prediction, zero-shot prediction, and geo-feature regression.
We focus on species distribution modeling using global-scale presence-only data, leveraging geographical and environmental features to map species ranges, as in previous studies. However, we innovate by integrating taxonomic classification into our approach. Specifically, we propose using a large language model to extract a latent representation of the taxonomic classification from a textual prompt. This allows us to map the range of any taxonomic rank, including unseen species, without additional supervision. We also present a new proximity-aware evaluation metric, suitable for evaluating species distribution models, which addresses critical shortcomings of traditional metrics. We evaluated our model for species range prediction, zero-shot prediction, and geo-feature regression and found that it outperforms several state-of-the-art models.
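To make the idea of a textual taxonomic prompt concrete, the following is a minimal sketch of how a taxonomic classification could be serialized into text and truncated at a chosen rank, so that the same mechanism covers a family, a genus, or a species. The prompt format, the `RANKS` tuple, and the function name `build_taxonomic_prompt` are illustrative assumptions, not the format used in the paper, and the language-model encoding step is deliberately left out.

```python
from typing import Mapping, Optional

# Hypothetical rank ordering for illustration; the paper's prompt format is not given here.
RANKS = ("kingdom", "phylum", "class", "order", "family", "genus", "species")


def build_taxonomic_prompt(taxonomy: Mapping[str, str],
                           down_to: Optional[str] = None) -> str:
    """Serialize a taxonomy into text, from kingdom down to `down_to` (inclusive).

    If `down_to` is None, every consecutive rank present in `taxonomy` is used.
    """
    if down_to is not None and down_to not in RANKS:
        raise ValueError(f"unknown rank: {down_to!r}")
    parts = []
    for rank in RANKS:
        if rank not in taxonomy:
            break  # stop at the first missing rank to keep the hierarchy contiguous
        parts.append(f"{rank}: {taxonomy[rank]}")
        if rank == down_to:
            break
    return ", ".join(parts)


# Example taxonomy (red fox).
red_fox = {
    "kingdom": "Animalia",
    "phylum": "Chordata",
    "class": "Mammalia",
    "order": "Carnivora",
    "family": "Canidae",
    "genus": "Vulpes",
    "species": "Vulpes vulpes",
}

family_prompt = build_taxonomic_prompt(red_fox, down_to="family")
species_prompt = build_taxonomic_prompt(red_fox)
```

Under this assumed scheme, a species absent from the training data still shares most of its prompt with known relatives in the same genus or family, which is one plausible reading of how a text-derived representation could support zero-shot range prediction.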