RO CV | May 13, 2025

Parameter-Efficient Fine-Tuning of Vision Foundation Model for Forest Floor Segmentation from UAV Imagery

Mohammad Wasil, Ahmad Drak, Brennan Penfold, Ludovico Scarton, Maximilian Johenneken, Alexander Asteroth, Sebastian Houben

arXiv:2505.08932v1 | 3.21 citations | h-index: 7 | Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses forest monitoring challenges for reforestation efforts, but it is incremental as it applies existing PEFT techniques to a new domain.

The paper tackles forest floor segmentation from UAV imagery by adapting the Segment Anything Model (SAM) with parameter-efficient fine-tuning (PEFT). An adapter-based method achieves the highest mean intersection over union (mIoU), while LoRA provides a lightweight alternative.
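For readers unfamiliar with the metric, the following is a minimal sketch of how mIoU is commonly computed from integer label maps. The function name `mean_iou`, the toy arrays, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; the paper's exact evaluation protocol is not stated in this summary.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Average per-class intersection over union for integer label maps."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            # Class appears in neither map; skipping it is an assumption here.
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious)) if ious else float("nan")

# Toy 2x3 label maps with three hypothetical classes (0, 1, 2).
pred = np.array([[0, 0, 1],
                 [1, 2, 2]])
target = np.array([[0, 1, 1],
                   [1, 2, 2]])
score = mean_iou(pred, target, num_classes=3)
```

In this toy case the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. A single mislabeled pixel lowers the score of both classes it touches, which is why mIoU is sensitive to ambiguous boundaries.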

Unmanned Aerial Vehicles (UAVs) are increasingly used for reforestation and forest monitoring, including seed dispersal in hard-to-reach terrains. However, a detailed understanding of the forest floor remains a challenge due to high natural variability, quickly changing environmental parameters, and ambiguous annotations due to unclear definitions. To address this issue, we adapt the Segment Anything Model (SAM), a vision foundation model with strong generalization capabilities, to segment forest floor objects such as tree stumps, vegetation, and woody debris. To this end, we employ parameter-efficient fine-tuning (PEFT) to fine-tune a small subset of additional model parameters while keeping the original weights fixed. We adjust SAM's mask decoder to generate masks corresponding to our dataset categories, allowing for automatic segmentation without manual prompting. Our results show that the adapter-based PEFT method achieves the highest mean intersection over union (mIoU), while Low-rank Adaptation (LoRA), with fewer parameters, offers a lightweight alternative for resource-constrained UAV platforms.
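The idea of training a small set of additional parameters while the original weights stay fixed can be sketched with a LoRA-style linear layer. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the class name `LoRALinear`, the rank, the scaling factor `alpha`, and the layer size are all hypothetical, and the abstract does not say which SAM layers are adapted.

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # pretrained weights, kept fixed during fine-tuning
        # Only A and B would be updated by an optimizer.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))  # zero init: the update starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        # x has shape (batch, in_dim); the low-rank path is added to the frozen path.
        return x @ self.weight.T + self.scale * (x @ self.A.T) @ self.B.T

    def trainable_params(self):
        return self.A.size + self.B.size

# Hypothetical 256x256 layer standing in for one projection in a larger model.
rng = np.random.default_rng(1)
W = rng.normal(size=(256, 256))
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 256))
y = layer.forward(x)
fraction = layer.trainable_params() / W.size
```

With rank 4 on a 256x256 layer, the trainable matrices hold 2,048 values against 65,536 frozen ones, about 3% of the layer. Because B starts at zero, the adapted layer initially reproduces the frozen layer's output exactly, so fine-tuning begins from the pretrained behavior.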
