CVOct 10, 2025

TARO: Toward Semantically Rich Open-World Object Detection

arXiv:2510.09173v1h-index: 4
Originality Incremental advance
AI Analysis

This addresses the need for more descriptive object detection in safety-critical applications like autonomous driving, though it is an incremental improvement over existing open-set methods.

The paper tackles the problem of open-world object detection by moving beyond a single 'Unknown' class to assign unknown objects into coarse semantic categories, achieving up to 29.9% categorization of unknowns and reducing confusion with known classes.

Modern object detectors are largely confined to a "closed-world" assumption, limiting them to a predefined set of classes and posing risks when encountering novel objects in real-world scenarios. While open-set detection methods aim to address this by identifying such instances as 'Unknown', this is often insufficient. Rather than treating all unknowns as a single class, assigning them more descriptive subcategories can enhance decision-making in safety-critical contexts. For example, identifying an object as an 'Unknown Animal' (requiring an urgent stop) versus 'Unknown Debris' (requiring a safe lane change) is far more useful than just 'Unknown' in autonomous driving. To bridge this gap, we introduce TARO, a novel detection framework that not only identifies unknown objects but also classifies them into coarse parent categories within a semantic hierarchy. TARO employs a unique architecture with a sparsemax-based head for modeling objectness, a hierarchy-guided relabeling component that provides auxiliary supervision, and a classification module that learns hierarchical relationships. Experiments show TARO can categorize up to 29.9% of unknowns into meaningful coarse classes, significantly reduce confusion between unknown and known classes, and achieve competitive performance in both unknown recall and known mAP. Code will be made available.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes