CV | Oct 8, 2025

DADO: A Depth-Attention framework for Object Discovery

arXiv:2510.07089v1 | h-index: 45 | CAIP
Originality: Incremental advance
AI Analysis

This paper addresses the problem of identifying objects in images without labels for computer vision applications. It represents an incremental improvement over existing methods.

The paper tackles unsupervised object discovery in images by introducing DADO, a model that combines attention and depth with dynamic weighting, achieving state-of-the-art accuracy and robustness on benchmarks without fine-tuning.

Unsupervised object discovery, the task of identifying and localizing objects in images without human-annotated labels, remains a significant challenge and a growing focus in computer vision. In this work, we introduce a novel model, DADO (Depth-Attention self-supervised technique for Discovering unseen Objects), which combines an attention mechanism and a depth model to identify potential objects in images. To address challenges such as noisy attention maps or complex scenes with varying depth planes, DADO employs dynamic weighting to adaptively emphasize attention or depth features based on the global characteristics of each image. We evaluated DADO on standard benchmarks, where it outperforms state-of-the-art methods in object discovery accuracy and robustness without the need for fine-tuning.
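The abstract does not spell out how the dynamic weighting is computed, so the following is only a minimal sketch of the general idea, under stated assumptions: both maps are flattened to lists of non-negative floats, the "global characteristic" is taken to be the entropy of the attention map, and the fusion is a simple convex combination. The function names (`min_max`, `attention_confidence`, `fuse`) and the entropy-based rule are hypothetical illustrations, not DADO's actual formulation.

```python
import math


def min_max(values):
    """Rescale a flat list of floats to [0, 1]; a constant map becomes all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def attention_confidence(attn):
    """Hypothetical global statistic: 1 minus the normalized entropy of the map.

    A sharply peaked attention map gives a value near 1 (trust attention);
    a diffuse or empty map gives a value near 0 (fall back on depth).
    """
    total = sum(attn)
    n = len(attn)
    if total <= 0 or n < 2:
        return 0.0
    probs = [a / total for a in attn]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(n)


def fuse(attn, depth):
    """Blend an attention map and a depth-derived saliency map with one per-image weight.

    Assumption: the weight is a single scalar per image, chosen from the
    attention map alone. Returns (weight, fused_map).
    """
    if len(attn) != len(depth):
        raise ValueError("attention and depth maps must have the same size")
    a = min_max(attn)
    d = min_max(depth)
    w = attention_confidence(a)
    fused = [w * x + (1.0 - w) * y for x, y in zip(a, d)]
    return w, fused


if __name__ == "__main__":
    # Peaked attention: the fused map follows attention.
    print(fuse([0.0, 0.0, 0.0, 1.0], [3.0, 2.0, 1.0, 0.0]))
    # Flat attention: the fused map follows depth.
    print(fuse([1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 4.0]))
```

In this sketch a single scalar per image decides how much each cue contributes, which matches the abstract's description of weighting based on global image characteristics. A real system would operate on 2D feature maps and could use a different statistic or a learned weight.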

