CV · Sep 18, 2025

Lost in Translation? Vocabulary Alignment for Source-Free Adaptation in Open-Vocabulary Semantic Segmentation

Silvio Mazzucco, Carl Persson, Mattia Segu, Pier Luigi Dovesi, Federico Tombari, Luc Van Gool, Matteo Poggi

arXiv:2509.15225v33.61 citations · h-index: 28

Originality: Incremental advance

AI Analysis

The paper addresses source-free domain adaptation of vision-language models (VLMs) for semantic segmentation, that is, adapting to a target domain without access to the source data. It is an incremental improvement for computer vision applications.

The paper tackles source-free domain adaptation for open-vocabulary semantic segmentation by introducing VocAlign, which combines a vocabulary alignment strategy with a student-teacher paradigm to improve pseudo-label generation. It reports a 6.11 mIoU improvement on CityScapes and superior performance on zero-shot segmentation benchmarks.

We introduce VocAlign, a novel source-free domain adaptation framework specifically designed for VLMs in open-vocabulary semantic segmentation. Our method adopts a student-teacher paradigm enhanced with a vocabulary alignment strategy, which improves pseudo-label generation by incorporating additional class concepts. To ensure efficiency, we use Low-Rank Adaptation (LoRA) to fine-tune the model, preserving its original capabilities while minimizing computational overhead. In addition, we propose a Top-K class selection mechanism for the student model, which significantly reduces memory requirements while further improving adaptation performance. Our approach achieves a notable 6.11 mIoU improvement on the CityScapes dataset and demonstrates superior performance on zero-shot segmentation benchmarks, setting a new standard for source-free adaptation in the open-vocabulary setting.
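To make the student-teacher and Top-K ideas in the abstract concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not say how classes are ranked or how the teacher is updated, so ranking by mean teacher probability and using an exponential moving average (EMA) teacher are assumptions, and the function names `top_k_classes`, `pseudo_labels`, and `ema_update` are hypothetical.

```python
def top_k_classes(teacher_probs, k):
    """Return the indices of the k classes with the highest mean teacher probability.

    teacher_probs: list of per-pixel lists, one probability per class.
    Ranking by the mean over pixels is an assumption made for illustration.
    """
    num_classes = len(teacher_probs[0])
    num_pixels = len(teacher_probs)
    mean_score = [
        sum(pixel[c] for pixel in teacher_probs) / num_pixels
        for c in range(num_classes)
    ]
    ranked = sorted(range(num_classes), key=lambda c: mean_score[c], reverse=True)
    return sorted(ranked[:k])


def pseudo_labels(teacher_probs, kept):
    """Per-pixel argmax restricted to the kept classes."""
    return [max(kept, key=lambda c: pixel[c]) for pixel in teacher_probs]


def ema_update(teacher_weights, student_weights, momentum=0.99):
    """Move each teacher weight slightly toward the student weight (assumed EMA rule)."""
    return [
        momentum * t + (1.0 - momentum) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


# Three pixels, four candidate classes (toy numbers, not from the paper).
probs = [
    [0.7, 0.2, 0.1, 0.0],
    [0.6, 0.1, 0.3, 0.0],
    [0.1, 0.1, 0.8, 0.0],
]
kept = top_k_classes(probs, k=2)      # [0, 2]
labels = pseudo_labels(probs, kept)   # [0, 0, 2]
```

In this sketch the student would only need to score the classes in `kept`, which is one plausible way a Top-K selection could reduce memory when the vocabulary is large.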
