CV, Jun 19, 2025

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

arXiv:2506.16058v2, 5 citations, h-index: 8
Originality: Incremental advance
AI Analysis

This paper addresses an evaluation gap in open-vocabulary segmentation by providing a more challenging benchmark, though the method itself appears incremental.

The authors observe that existing open-vocabulary segmentation benchmarks do not adequately test a model's ability to handle truly diverse concepts, because their semantic space closely resembles that of the training data. They introduce OpenBench, a new benchmark whose semantics differ significantly from the training data, and propose OVSNet, which achieves state-of-the-art results on both existing datasets and OpenBench.

Open-vocabulary segmentation aims to segment arbitrary categories given unrestricted text inputs as guidance. To this end, recent works have developed various technical routes to exploit the potential of large-scale pre-trained vision-language models and have made significant progress on existing benchmarks. However, we find that existing test sets are limited in measuring a model's comprehension of "open-vocabulary" concepts, because their semantic space closely resembles the training space and even contains many overlapping categories. We therefore present a new benchmark named OpenBench, whose semantics differ significantly from those seen in training. It is designed to better assess a model's ability to understand and segment a wide range of real-world concepts. When testing existing methods on OpenBench, we find that their performance diverges from the conclusions drawn on existing test sets. In addition, we propose a method named OVSNet to improve segmentation performance in diverse and open scenarios. Through elaborate fusion of heterogeneous features and cost-free expansion of the training space, OVSNet achieves state-of-the-art results on both existing datasets and our proposed OpenBench. The corresponding analyses demonstrate the soundness and effectiveness of the proposed benchmark and method.
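The abstract's point about overlapping categories can be made concrete with a rough check of how many test category names already appear in the training vocabulary. The sketch below is only an illustration and is not the analysis used in the paper: the function names `normalize` and `category_overlap` and the toy category lists are hypothetical, and exact name matching is a simplification that ignores synonyms and related concepts.

```python
def normalize(name: str) -> str:
    """Lowercase a category name and unify separators (hypothetical helper)."""
    return " ".join(name.lower().replace("-", " ").replace("_", " ").split())


def category_overlap(train_categories, test_categories):
    """Return the shared category names and the fraction of test
    categories whose (normalized) name also occurs in training."""
    train = {normalize(c) for c in train_categories}
    test = {normalize(c) for c in test_categories}
    shared = train & test
    seen_fraction = len(shared) / len(test) if test else 0.0
    return shared, seen_fraction


# Toy vocabularies, invented purely for illustration.
train_vocab = ["person", "dog", "traffic light", "bench"]
test_vocab = ["Dog", "traffic-light", "axolotl", "theodolite"]

shared, seen_fraction = category_overlap(train_vocab, test_vocab)
print(sorted(shared), seen_fraction)
```

A high fraction suggests that a test set mostly re-evaluates concepts seen during training; a stricter comparison would also need to account for synonyms and semantically close categories, which name matching cannot capture.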

