VibrantVS: A high-resolution multi-task transformer for forest canopy height estimation
VibrantVS offers improved tools for ecological monitoring and land management, such as wildfire mitigation, but the contribution is incremental: it applies existing transformer methods to a specific domain.
The paper tackles forest canopy height estimation with a multi-task vision transformer applied to NAIP imagery. It reports higher accuracy and precision than benchmark models across diverse ecoregions in the western U.S., and its inferences can be updated every three years or less.
This paper explores the application of a novel multi-task vision transformer (ViT) model to the estimation of canopy height models (CHMs) from 4-band National Agriculture Imagery Program (NAIP) imagery across the western United States. We compare this model against three peer-reviewed benchmark models in terms of accuracy and precision, aggregated across ecoregions and height classes. Key findings suggest that, while the benchmark models can provide high precision in localized areas, the VibrantVS model has substantial advantages across a broad range of ecoregions in the western United States: higher accuracy, higher precision, high spatial resolution, and the ability to generate updated inference at a cadence of three years or less. The VibrantVS model provides significant value for ecological monitoring and land management decisions, including wildfire mitigation.
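The comparison described above aggregates errors by ecoregion and height class. The following is a minimal sketch of that kind of aggregation, not the paper's evaluation code. It assumes that accuracy is summarized by mean absolute error and precision by the standard deviation of the error; the abstract does not define either metric. The bin edges, the function names `height_class` and `summarize_errors`, and the ecoregion names in the usage note are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical height-class edges in metres (an assumption, not from the paper).
HEIGHT_BIN_EDGES_M = [0.0, 5.0, 15.0, 30.0, float("inf")]


def height_class(reference_height_m):
    """Return a label such as '5-15 m' for the bin containing the reference height."""
    for lo, hi in zip(HEIGHT_BIN_EDGES_M, HEIGHT_BIN_EDGES_M[1:]):
        if lo <= reference_height_m < hi:
            return f">={lo:g} m" if hi == float("inf") else f"{lo:g}-{hi:g} m"
    raise ValueError("reference height must be non-negative")


def summarize_errors(samples):
    """Group (ecoregion, predicted_m, reference_m) samples and summarize errors.

    Returns {(ecoregion, height_class): {"n", "mae_m", "bias_m", "sd_m"}}.
    """
    errors = defaultdict(list)
    for ecoregion, predicted_m, reference_m in samples:
        # Bin by the reference (e.g. lidar-derived) height, not the prediction.
        key = (ecoregion, height_class(reference_m))
        errors[key].append(predicted_m - reference_m)

    summary = {}
    for key, errs in errors.items():
        summary[key] = {
            "n": len(errs),
            "mae_m": mean(abs(e) for e in errs),  # assumed accuracy proxy
            "bias_m": mean(errs),                 # signed mean error
            "sd_m": pstdev(errs),                 # assumed precision proxy
        }
    return summary
```

Calling `summarize_errors([("Region A", 12.0, 10.0), ("Region A", 9.0, 10.0)])` groups both samples under `("Region A", "5-15 m")`, with one summary row per ecoregion and height class. Binning by the reference height keeps the classes identical for every model being compared, so the per-class rows can be compared directly across models.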