CV · AI · LG · RO · Sep 25, 2024

First Place Solution to the ECCV 2024 BRAVO Challenge: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation

arXiv:2409.17208v2 · 3 citations · h-index: 19 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses robustness evaluation for vision models in semantic segmentation, but it is incremental as it applies an existing method to a new challenge.

The paper addresses robust semantic segmentation with vision foundation models: it fine-tunes DINOv2 with a segmentation decoder on Cityscapes and achieves first place in the ECCV 2024 BRAVO Challenge, which evaluates robustness on out-of-distribution data.

In this report, we present the first place solution to the ECCV 2024 BRAVO Challenge, where a model is trained on Cityscapes and its robustness is evaluated on several out-of-distribution datasets. Our solution leverages the powerful representations learned by vision foundation models by attaching a simple segmentation decoder to DINOv2 and fine-tuning the entire model. This approach outperforms more complex existing approaches and achieves first place in the challenge. Our code is publicly available at https://github.com/tue-mps/benchmark-vfm-ss.
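
To make the idea of "a simple segmentation decoder" on top of patch tokens concrete, here is a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the decoder, so the per-token linear layer, the nearest-neighbour upsampling, the toy embedding size, and all function names are assumptions for illustration. The patch size of 14 (DINOv2) and the 19 Cityscapes evaluation classes are the only values taken from the real setup, and the random tokens stand in for backbone outputs.

```python
import random

PATCH = 14         # DINOv2 ViT patch size in pixels
NUM_CLASSES = 19   # Cityscapes evaluation classes


def linear_head(tokens, weight, bias):
    """Per-token linear layer: logits[i][c] = dot(tokens[i], weight[c]) + bias[c]."""
    return [
        [sum(t * w for t, w in zip(tok, w_c)) + b_c for w_c, b_c in zip(weight, bias)]
        for tok in tokens
    ]


def segment(tokens, height, width, weight, bias):
    """Turn a flat list of patch tokens into a per-pixel label mask.

    Assumes height and width are multiples of PATCH and that tokens are
    ordered row by row over the patch grid (hypothetical convention).
    """
    grid_h, grid_w = height // PATCH, width // PATCH
    assert len(tokens) == grid_h * grid_w, "one token per patch expected"

    logits = linear_head(tokens, weight, bias)
    # Arg-max over classes for every patch.
    patch_labels = [max(range(len(l)), key=l.__getitem__) for l in logits]

    # Nearest-neighbour upsampling: every pixel takes the label of its patch.
    return [
        [patch_labels[(y // PATCH) * grid_w + (x // PATCH)] for x in range(width)]
        for y in range(height)
    ]


# Toy run: random vectors stand in for the backbone's patch embeddings.
rng = random.Random(0)
EMBED_DIM = 8                      # toy size, far smaller than a real ViT
HEIGHT, WIDTH = 28, 42             # 2 x 3 patch grid
tokens = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)]
          for _ in range((HEIGHT // PATCH) * (WIDTH // PATCH))]
weight = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(NUM_CLASSES)]
bias = [0.0] * NUM_CLASSES

mask = segment(tokens, HEIGHT, WIDTH, weight, bias)
print(len(mask), len(mask[0]))     # mask height and width in pixels
```

In an actual fine-tuning setup, the decoder weights and the backbone would be trained jointly on Cityscapes, as the abstract describes; the sketch only shows how per-patch predictions map back to pixel resolution.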
