CV, LG | Apr 14, 2025

FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation

arXiv:2504.10487v2 | 5 citations | h-index: 21 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in open-vocabulary semantic segmentation for computer vision applications, offering a practical improvement that is orthogonal to existing methods.

The paper tackles open-vocabulary semantic segmentation by challenging the conventional use of averaged class-wise text embeddings. It finds that, for each class, some single-template classifiers (class-experts) outperform the averaged classifier, and it introduces FLOSS, a plug-and-play method that improves state-of-the-art models without additional labels or training. The reported gains are consistent across datasets and hold in low-data scenarios.

In this paper, we challenge the conventional practice in Open-Vocabulary Semantic Segmentation (OVSS) of using averaged class-wise text embeddings, which are typically obtained by encoding each class name with multiple templates (e.g., a photo of a <class>, a sketch of a <class>). We investigate the impact of templates for OVSS, and find that for each class, there exist single-template classifiers, which we refer to as class-experts, that significantly outperform the conventional averaged classifier. First, to identify these class-experts, we introduce a novel approach that estimates them without any labeled data or training. By leveraging the class-wise prediction entropy of single-template classifiers, we select those yielding the lowest entropy as the most reliable class-experts. Second, we combine the outputs of class-experts in a new fusion process. Our plug-and-play method, coined FLOSS, is orthogonal and complementary to existing OVSS methods, offering an improvement without the need for additional labels or training. Extensive experiments show that FLOSS consistently enhances state-of-the-art OVSS models, generalizes well across datasets with different distribution shifts, and delivers substantial improvements in low-data scenarios where only a few unlabeled images are available. Our code is available at https://github.com/yasserben/FLOSS.
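
The entropy-based selection described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes that each single-template classifier has already produced per-pixel softmax probabilities, and that "class-wise prediction entropy" means the mean entropy over the pixels a template assigns to a class. The function names `select_class_experts` and `fuse_experts`, the `top_k` parameter, and the fusion rule (take each class score from that class's experts, then argmax) are hypothetical choices made only for illustration; the abstract does not specify the paper's fusion process.

```python
import numpy as np


def select_class_experts(probs, top_k=1):
    """Pick, for each class, the templates with the lowest class-wise entropy.

    probs: array of shape (T, N, C) holding, for each of T templates, the
           softmax probabilities over C classes on N unlabeled pixels.
    Returns an int array of shape (C, top_k) of template indices.
    """
    T, N, C = probs.shape
    # Entropy of each pixel's predicted distribution, per template: (T, N).
    pixel_entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    pred = probs.argmax(axis=-1)  # hard prediction per template: (T, N)

    # Assumption: class-wise entropy = mean entropy over pixels that the
    # template itself predicts as that class (no labels are used).
    class_entropy = np.full((T, C), np.inf)
    for t in range(T):
        for c in range(C):
            mask = pred[t] == c
            if mask.any():
                class_entropy[t, c] = pixel_entropy[t, mask].mean()

    # Lowest entropy first; templates that never predict a class sort last.
    return np.argsort(class_entropy, axis=0)[:top_k].T


def fuse_experts(probs, experts):
    """Illustrative fusion (an assumption, not the paper's rule): score each
    class with the mean probability given by its own experts, then argmax."""
    T, N, C = probs.shape
    fused = np.empty((N, C))
    for c in range(C):
        fused[:, c] = probs[experts[c]][:, :, c].mean(axis=0)
    return fused.argmax(axis=-1)


# Toy data: 2 templates, 4 pixels, 2 classes.
probs = np.array([
    # Template 0: confident on class 0, hesitant on class 1.
    [[0.99, 0.01], [0.99, 0.01], [0.45, 0.55], [0.45, 0.55]],
    # Template 1: hesitant on class 0, confident on class 1.
    [[0.55, 0.45], [0.55, 0.45], [0.01, 0.99], [0.01, 0.99]],
])
experts = select_class_experts(probs, top_k=1)
labels = fuse_experts(probs, experts)
print(experts.tolist())  # [[0], [1]]
print(labels.tolist())   # [0, 0, 1, 1]
```

In this toy case, template 0 is chosen as the expert for class 0 and template 1 for class 1, because each is the most confident on the pixels it assigns to that class.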

Code Implementations: 1 repo