CV | Sep 23, 2025

Overview of PlantCLEF 2021: cross-domain plant identification

arXiv:2509.18697v1 | 10.224 citations | h-index: 43 | CLEF

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of limited field-photo data for plant species identification in tropical areas. Its contribution is incremental: it leverages existing herbarium records rather than introducing new data sources.

The PlantCLEF 2021 challenge tackled automated plant identification in biodiversity-rich but data-poor tropical regions by using herbarium collections to train models. It was framed as a cross-domain classification task on a dataset of about 1,000 species from the Guiana Shield.

Automated plant identification has improved considerably thanks to recent advances in deep learning and the availability of training data with more and more field photos. However, this profusion of data concerns only a few tens of thousands of species, mainly located in North America and Western Europe, much less in the richest regions in terms of biodiversity such as tropical countries. On the other hand, for several centuries, botanists have systematically collected, catalogued and stored plant specimens in herbaria, especially in tropical regions, and recent efforts by the biodiversity informatics community have made it possible to put millions of digitised records online. The LifeCLEF 2021 plant identification challenge (or "PlantCLEF 2021") was designed to assess the extent to which automated identification of flora in data-poor regions can be improved by using herbarium collections. It is based on a dataset of about 1,000 species mainly focused on the Guiana Shield of South America, a region known to have one of the highest plant diversities in the world. The challenge was evaluated as a cross-domain classification task where the training set consisted of several hundred thousand herbarium sheets and a few thousand photos to allow learning a correspondence between the two domains. In addition to the usual metadata (location, date, author, taxonomy), the training data also includes the values of 5 morphological and functional traits for each species. The test set consisted exclusively of photos taken in the field. This article presents the resources and evaluations of the assessment carried out, summarises the approaches and systems used by the participating research groups and provides an analysis of the main results.
