David Russell

11.8 · CV · Jun 3, 2025 · Code

Zero-Shot Tree Detection and Segmentation from Aerial Forest Imagery

Michelle Chen, David Russell, Amritha Pallavoor et al.

Large-scale delineation of individual trees from remote sensing imagery is crucial to the advancement of ecological research, particularly as climate change and other environmental factors rapidly transform forest landscapes across the world. Current RGB tree segmentation methods rely on training specialized machine learning models with labeled tree datasets. While these learning-based approaches can outperform manual data collection when accurate, the existing models still depend on training data that's hard to scale. In this paper, we investigate the efficacy of using a state-of-the-art image segmentation model, Segment Anything Model 2 (SAM2), in a zero-shot manner for individual tree detection and segmentation. We evaluate a pretrained SAM2 model on two tasks in this domain: (1) zero-shot segmentation and (2) zero-shot transfer by using predictions from an existing tree detection model as prompts. Our results suggest that SAM2 not only has impressive generalization capabilities, but also can form a natural synergy with specialized methods trained on in-domain labeled data. We find that applying large pretrained models to problems in remote sensing is a promising avenue for future progress. We make our code available at: https://github.com/open-forest-observatory/tree-detection-framework.

2.0 · CV · May 15, 2024 · Code

Classifying geospatial objects from multiview aerial imagery using semantic meshes

David Russell, Ben Weinstein, David Wettergreen et al. · CMU

Aerial imagery is increasingly used in Earth science and natural resource management as a complement to labor-intensive ground-based surveys. Aerial systems can collect overlapping images that provide multiple views of each location from different perspectives. However, most prediction approaches (e.g. for tree species classification) use a single, synthesized top-down "orthomosaic" image as input that contains little to no information about the vertical aspects of objects and may include processing artifacts. We propose an alternate approach that generates predictions directly on the raw images and accurately maps these predictions into geospatial coordinates using semantic meshes. This method, released as a user-friendly open-source toolkit, enables analysts to use the highest quality data for predictions, capture information about the sides of objects, and leverage multiple viewpoints of each location for added robustness. We demonstrate the value of this approach on a new benchmark dataset of four forest sites in the western U.S. that consists of drone images, photogrammetry results, predicted tree locations, and species classification data derived from manual surveys. We show that our proposed multiview method improves classification accuracy from 53% to 75% relative to an orthomosaic baseline on a challenging cross-site tree species classification task.

2 Papers