CV · Jun 25, 2025

From Ideal to Real: Unified and Data-Efficient Dense Prediction for Real-World Scenarios

arXiv:2506.20279v2 · 1 citation · h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses real-world deployment challenges for computer vision applications that require pixel-wise annotations, though its contribution is incremental: it adapts generative models to dense prediction.

The paper tackles limited real-world generalization and data scarcity in dense prediction by introducing DenseWorld, a benchmark of 25 tasks, and proposing DenseDiT, a unified method that achieves superior results using less than 0.01% of the baselines' training data.

Dense prediction tasks are of significant importance in computer vision; they aim to learn pixel-wise annotated labels for input images. Despite advances in this field, existing methods primarily focus on idealized conditions, generalize poorly to real-world settings, and struggle with the acute scarcity of real-world data in practical scenarios. To study this problem systematically, we first introduce DenseWorld, a benchmark spanning 25 dense prediction tasks that correspond to urgent real-world applications, with unified evaluation across tasks. We then propose DenseDiT, which exploits the visual priors of generative models to perform diverse real-world dense prediction tasks through a unified strategy. DenseDiT combines a parameter-reuse mechanism with two lightweight branches that adaptively integrate multi-scale context. This design lets DenseDiT be tuned efficiently with less than 0.1% additional parameters, activating the visual priors while adapting effectively to diverse real-world dense prediction tasks. Evaluations on DenseWorld reveal significant performance drops in existing general and specialized baselines, highlighting their limited real-world generalization. In contrast, DenseDiT achieves superior results using less than 0.01% of the baselines' training data, underscoring its practical value for real-world deployment.

