Lukas Picek

CV
h-index20
16papers
146citations
Novelty28%
AI Score48

16 Papers

CVNov 15, 2023Code
WildlifeDatasets: An open-source toolkit for animal re-identification

Vojtěch Čermák, Lukas Picek, Lukáš Adam et al.

In this paper, we present WildlifeDatasets (https://github.com/WildlifeDatasets/wildlife-datasets) - an open-source toolkit intended primarily for ecologists and computer-vision / machine-learning researchers. The WildlifeDatasets is written in Python, allows straightforward access to publicly available wildlife datasets, and provides a wide variety of methods for dataset pre-processing, performance analysis, and model fine-tuning. We showcase the toolkit in various scenarios and baseline experiments, including, to the best of our knowledge, the most comprehensive experimental comparison of datasets and methods for wildlife re-identification, including both local descriptors and deep learning approaches. Furthermore, we provide the first-ever foundation model for individual re-identification within a wide range of species - MegaDescriptor - that provides state-of-the-art performance on animal re-identification datasets and outperforms other pre-trained models such as CLIP and DINOv2 by a significant margin. To make the model available to the general public and to allow easy integration with any existing wildlife monitoring applications, we provide multiple MegaDescriptor flavors (i.e., Small, Medium, and Large) through the HuggingFace hub (https://huggingface.co/BVRA).

CVAug 24, 2024Code
FungiTastic: A multi-modal dataset and benchmark for image categorization

Lukas Picek, Klara Janouskova, Vojtech Cermak et al.

We introduce a new, challenging benchmark and a dataset, FungiTastic, based on fungal records continuously collected over a twenty-year span. The dataset is labelled and curated by experts and consists of about 350k multimodal observations of 6k fine-grained categories (species). The fungi observations include photographs and additional data, e.g., meteorological and climatic data, satellite images, and body part segmentation masks. FungiTastic is one of the few benchmarks that include a test set with DNA-sequenced ground truth of unprecedented label reliability. The benchmark is designed to support (i) standard closed-set classification, (ii) open-set classification, (iii) multi-modal classification, (iv) few-shot learning, (v) domain shift, and many more. We provide tailored baselines for many use cases, a multitude of ready-to-use pre-trained models on https://huggingface.co/collections/BVRA/fungitastic-66a227ce0520be533dc6403b, and a framework for model training. The documentation and the baselines are available at https://github.com/BohemianVRA/FungiTastic/ and https://www.kaggle.com/datasets/picekl/fungitastic.

LGSep 26, 2024Code
MALPOLON: A Framework for Deep Species Distribution Modeling

Theo Larcher, Lukas Picek, Benjamin Deneu et al.

This paper describes a deep-SDM framework, MALPOLON. Written in Python and built upon the PyTorch library, this framework aims to facilitate training and inferences of deep species distribution models (deep-SDM) and sharing for users with only general Python language skills (e.g., modeling ecologists) who are interested in testing deep learning approaches to build new SDMs. More advanced users can also benefit from the framework's modularity to run more specific experiments by overriding existing classes while taking advantage of press-button examples to train neural networks on multiple classification tasks using custom or provided raw and pre-processed datasets. The framework is open-sourced on GitHub and PyPi along with extensive documentation and examples of use in various scenarios. MALPOLON offers straightforward installation, YAML-based configuration, parallel computing, multi-GPU utilization, baseline and foundational models for benchmarking, and extensive tutorials/documentation, aiming to enhance accessibility and performance scalability for ecologists and researchers.

CVAug 25, 2024
GeoPlant: Spatial Plant Species Prediction Dataset

Lukas Picek, Christophe Botella, Maximilien Servajean et al.

The difficulty of monitoring biodiversity at fine scales and over large areas limits ecological knowledge and conservation efforts. To fill this gap, Species Distribution Models (SDMs) predict species across space from spatially explicit features. Yet, they face the challenge of integrating the rich but heterogeneous data made available over the past decade, notably millions of opportunistic species observations and standardized surveys, as well as multimodal remote sensing data. In light of that, we have designed and developed a new European-scale dataset for SDMs at high spatial resolution (10--50m), including more than 10k species (i.e., most of the European flora). The dataset comprises 5M heterogeneous Presence-Only records and 90k exhaustive Presence-Absence survey records, all accompanied by diverse environmental rasters (e.g., elevation, human footprint, and soil) traditionally used in SDMs. In addition, it provides Sentinel-2 RGB and NIR satellite images with 10 m resolution, a 20-year time series of climatic variables, and satellite time series from the Landsat program. In addition to the data, we provide an openly accessible SDM benchmark (hosted on Kaggle), which has already attracted an active community and a set of strong baselines for single predictor/modality and multimodal approaches. All resources, e.g., the dataset, pre-trained models, and baseline methods (in the form of notebooks), are available on Kaggle, allowing one to start with our dataset literally with two mouse clicks.

CVAug 23, 2024
WildFusion: Individual Animal Identification with Calibrated Similarity Fusion

Vojtěch Cermak, Lukas Picek, Lukáš Adam et al.

We propose a new method - WildFusion - for individual identification of a broad range of animal species. The method fuses deep scores (e.g., MegaDescriptor or DINOv2) and local matching similarity (e.g., LoFTR and LightGlue) to identify individual animals. The global and local information fusion is facilitated by similarity score calibration. In a zero-shot setting, relying on local similarity score only, WildFusion achieved mean accuracy, measured on 17 datasets, of 76.2%. This is better than the state-of-the-art model, MegaDescriptor-L, whose training set included 15 of the 17 datasets. If a dataset-specific calibration is applied, mean accuracy increases by 2.3% percentage points. WildFusion, with both local and global similarity scores, outperforms the state-of-the-art significantly - mean accuracy reached 84.0%, an increase of 8.5 percentage points; the mean relative error drops by 35%. We make the code and pre-trained models publicly available5, enabling immediate use in ecology and conservation.

CVAug 23, 2024
Animal Identification with Independent Foreground and Background Modeling

Lukas Picek, Lukas Neumann, Jiri Matas

We propose a method that robustly exploits background and foreground in visual identification of individual animals. Experiments show that their automatic separation, made easy with methods like Segment Anything, together with independent foreground and background-related modeling, improves results. The two predictions are combined in a principled way, thanks to novel Per-Instance Temperature Scaling that helps the classifier to deal with appearance ambiguities in training and to produce calibrated outputs in the inference phase. For identity prediction from the background, we propose novel spatial and temporal models. On two problems, the relative error w.r.t. the baseline was reduced by 22.3% and 8.8%, respectively. For cases where objects appear in new locations, an example of background drift, accuracy doubles.

49.7CYMay 15
To Trust or Not to Trust: Authors' Response to AI-based Reviews

César Leblanc, Lukas Picek

Large language models are increasingly discussed and used as tools that may assist with scholarly peer review, but empirical evidence regarding how authors use and perceive AI-based feedback remains limited. This paper reports findings from two independent pilot studies on authors' use and perceptions of AI-based auxiliary review at two computer science venues. After the review release, authors were invited to complete an anonymous post-review questionnaire about the AI review's usefulness, trustworthiness, agreement with human reviews, practical value for revision, perceived inaccuracies, and consent. The final dataset included 56 analyzable responses from authors of 40 papers; closed-ended items were summarized using descriptive statistics, and open-ended responses were analyzed using inductive thematic analysis. Most respondents (83.9%) considered the AI-based review useful, and 80.4% reported that it identified issues not mentioned by human reviewers. This perceived added value translated into action: 82.1% reported using at least some AI feedback in their camera-ready version. However, the authors did not treat the AI review as equivalent to a human review. They generally trusted it less than the human reviews and found human feedback clearer, even though 25.0% described at least some human reviews as not very useful. Reported problems with the AI review were usually limited: 51.8% reported minor inaccuracies, while 16.1% reported clearly incorrect, misleading, or irrelevant comments. Support for future use was strongest when AI was framed as a supervised or author-controlled tool: 96.4% said they would use AI as an internal review tool before future submissions, 89.3% preferred advance notice that AI would be used in review, and 76.8% favored explicit consent before use.

54.5CVApr 10
ACCIDENT: A Benchmark Dataset for Vehicle Accident Detection from Traffic Surveillance Videos

Lukas Picek, Michal Čermák, Marek Hanzl et al.

We introduce ACCIDENT, a benchmark dataset for traffic accident detection in CCTV footage, designed to evaluate models in supervised (IID and OOD) and zero-shot settings, reflecting both data-rich and data-scarce scenarios. The benchmark consists of a curated set of 2,027 real and 2,211 synthetic clips annotated with the accident time, spatial location, and high-level collision type. We define three core tasks: (i) temporal localization of the accident, (ii) its spatial localization, and (iii) collision type classification. Each task is evaluated using custom metrics that account for the uncertainty and ambiguity inherent in CCTV footage. In addition to the benchmark, we provide a diverse set of baselines, including heuristic, motion-aware, and vision-language approaches, and show that ACCIDENT is challenging. You can access the ACCIDENT at: https://accidentbench.github.io

73.9PEApr 22
Centering Ecological Goals in Automated Identification of Individual Animals

Lukas Picek, Timm Haucke, Lukáš Adam et al.

Recognizing individual animals over time is central to many ecological and conservation questions, including estimating abundance, survival, movement, and social structure. Recent advances in automated identification from images and even acoustic data suggest that this process could be greatly accelerated, yet their promise has not translated well into ecological practice. We argue that the main barrier is not the performance of the automated methods themselves, but a mismatch between how those methods are typically developed and evaluated, and how ecological data is actually collected, processed, reviewed, and used. Future progress, therefore, will depend less on algorithmic gains alone than on recognizing that the usefulness of automated identification is grounded in ecological context: it depends on what question is being asked, what data are available, and what kinds of mistakes matter. Only by centering these questions can we move toward automated identification of individuals that is not only accurate but also ecologically useful, transparent, and trustworthy.

CVOct 16, 2024
Towards Zero-Shot Camera Trap Image Categorization

Jiří Vyskočil, Lukas Picek

This paper describes the search for an alternative approach to the automatic categorization of camera trap images. First, we benchmark state-of-the-art classifiers using a single model for all images. Next, we evaluate methods combining MegaDetector with one or more classifiers and Segment Anything to assess their impact on reducing location-specific overfitting. Last, we propose and test two approaches using large language and foundational models, such as DINOv2, BioCLIP, BLIP, and ChatGPT, in a zero-shot scenario. Evaluation carried out on two publicly available datasets (WCT from New Zealand, CCT20 from the Southwestern US) and a private dataset (CEF from Central Europe) revealed that combining MegaDetector with two separate classifiers achieves the highest accuracy. This approach reduced the relative error of a single BEiTV2 classifier by approximately 42\% on CCT20, 48\% on CEF, and 75\% on WCT. Besides, as the background is removed, the error in terms of accuracy in new locations is reduced to half. The proposed zero-shot pipeline based on DINOv2 and FAISS achieved competitive results (1.0\% and 4.7\% smaller on CCT20, and CEF, respectively), which highlights the potential of zero-shot approaches for camera trap image categorization.

CVDec 27, 2024
Zero-shot Hazard Identification in Autonomous Driving: A Case Study on the COOOL Benchmark

Lukas Picek, Vojtěch Čermák, Marek Hanzl

This paper presents our submission to the COOOL competition, a novel benchmark for detecting and classifying out-of-label hazards in autonomous driving. Our approach integrates diverse methods across three core tasks: (i) driver reaction detection, (ii) hazard object identification, and (iii) hazard captioning. We propose kernel-based change point detection on bounding boxes and optical flow dynamics for driver reaction detection to analyze motion patterns. For hazard identification, we combined a naive proximity-based strategy with object classification using a pre-trained ViT model. At last, for hazard captioning, we used the MOLMO vision-language model with tailored prompts to generate precise and context-aware descriptions of rare and low-resolution hazards. The proposed pipeline outperformed the baseline methods by a large margin, reducing the relative error by 33%, and scored 2nd on the final leaderboard consisting of 32 teams.

AIApr 7, 2025
Mapping biodiversity at very-high resolution in Europe

César Leblanc, Lukas Picek, Benjamin Deneu et al.

This paper describes a cascading multimodal pipeline for high-resolution biodiversity mapping across Europe, integrating species distribution modeling, biodiversity indicators, and habitat classification. The proposed pipeline first predicts species compositions using a deep-SDM, a multimodal model trained on remote sensing, climate time series, and species occurrence data at 50x50m resolution. These predictions are then used to generate biodiversity indicator maps and classify habitats with Pl@ntBERT, a transformer-based LLM designed for species-to-habitat mapping. With this approach, continental-scale species distribution maps, biodiversity indicator maps, and habitat maps are produced, providing fine-grained ecological insights. Unlike traditional methods, this framework enables joint modeling of interspecies dependencies, bias-aware training with heterogeneous presence-absence data, and large-scale inference from multi-source remote sensing inputs.

QMNov 16, 2025
GeoPl@ntNet: A Platform for Exploring Essential Biodiversity Variables

Lukas Picek, César Leblanc, Alexis Joly et al.

This paper describes GeoPl@ntNet, an interactive web application designed to make Essential Biodiversity Variables accessible and understandable to everyone through dynamic maps and fact sheets. Its core purpose is to allow users to explore high-resolution AI-generated maps of species distributions, habitat types, and biodiversity indicators across Europe. These maps, developed through a cascading pipeline involving convolutional neural networks and large language models, provide an intuitive yet information-rich interface to better understand biodiversity, with resolutions as precise as 50x50 meters. The website also enables exploration of specific regions, allowing users to select areas of interest on the map (e.g., urban green spaces, protected areas, or riverbanks) to view local species and their coverage. Additionally, GeoPl@ntNet generates comprehensive reports for selected regions, including insights into the number of protected species, invasive species, and endemic species.

CVAug 18, 2025
2COOOL: 2nd Workshop on the Challenge Of Out-Of-Label Hazards in Autonomous Driving

Ali K. AlShami, Ryan Rabinowitz, Maged Shoman et al.

As the computer vision community advances autonomous driving algorithms, integrating vision-based insights with sensor data remains essential for improving perception, decision making, planning, prediction, simulation, and control. Yet we must ask: Why don't we have entirely safe self-driving cars yet? A key part of the answer lies in addressing novel scenarios, one of the most critical barriers to real-world deployment. Our 2COOOL workshop provides a dedicated forum for researchers and industry experts to push the state of the art in novelty handling, including out-of-distribution hazard detection, vision-language models for hazard understanding, new benchmarking and methodologies, and safe autonomous driving practices. The 2nd Workshop on the Challenge of Out-of-Label Hazards in Autonomous Driving (2COOOL) will be held at the International Conference on Computer Vision (ICCV) 2025 in Honolulu, Hawaii, on October 19, 2025. We aim to inspire the development of new algorithms and systems for hazard avoidance, drawing on ideas from anomaly detection, open-set recognition, open-vocabulary modeling, domain adaptation, and related fields. Building on the success of its inaugural edition at the Winter Conference on Applications of Computer Vision (WACV) 2025, the workshop will feature a mix of academic and industry participation.

CVJun 5, 2025
CzechLynx: A Dataset for Individual Identification and Pose Estimation of the Eurasian Lynx

Lukas Picek, Elisa Belotti, Michal Bojda et al.

We introduce CzechLynx, the first large-scale, open-access dataset for individual identification, 2D pose estimation, and instance segmentation of the Eurasian lynx (Lynx lynx). CzechLynx includes more than 30k camera trap images annotated with segmentation masks, identity labels, and 20-point skeletons and covers 219 unique individuals across 15 years of systematic monitoring in two geographically distinct regions: Southwest Bohemia and the Western Carpathians. To increase the data variability, we create a complementary synthetic set with more than 100k photorealistic images generated via a Unity-based pipeline and diffusion-driven text-to-texture modeling, covering diverse environments, poses, and coat-pattern variations. To allow testing generalization across spatial and temporal domains, we define three tailored evaluation protocols/splits: (i) geo-aware, (ii) time-aware open-set, and (iii) time-aware closed-set. This dataset is targeted to be instrumental in benchmarking state-of-the-art models and the development of novel methods for not just individual animal re-identification.

CVJun 13, 2024
WildlifeReID-10k: Wildlife re-identification dataset with 10k individual animals

Lukáš Adam, Vojtěch Čermák, Kostas Papafitsoros et al.

This paper introduces WildlifeReID-10k, a new large-scale re-identification benchmark with more than 10k animal identities of around 33 species across more than 140k images, re-sampled from 37 existing datasets. WildlifeReID-10k covers diverse animal species and poses significant challenges for SoTA methods, ensuring fair and robust evaluation through its time-aware and similarity-aware split protocol. The latter is designed to address the common issue of training-to-test data leakage caused by visually similar images appearing in both training and test sets. The WildlifeReID-10k dataset and benchmark are publicly available on Kaggle, along with strong baselines for both closed-set and open-set evaluation, enabling fair, transparent, and standardized evaluation of not just multi-species animal re-identification models.