CV | Nov 22, 2023

T-Rex: Counting by Visual Prompting

arXiv:2311.13596v1 | 22 citations | h-index: 26
Originality: Incremental advance
AI Analysis

The paper addresses the need for flexible, user-guided object counting in diverse visual scenarios, though it builds incrementally on existing detection and prompting methods.

The paper tackles the problem of interactive object counting by introducing T-Rex, a model that uses visual prompts to detect and count objects, achieving state-of-the-art performance on class-agnostic benchmarks and demonstrating strong zero-shot capabilities.

We introduce T-Rex, an interactive object counting model designed to first detect and then count any objects. We formulate object counting as an open-set object detection task with the integration of visual prompts. Users can specify the objects of interest by marking points or boxes on a reference image, and T-Rex then detects all objects with a similar pattern. Guided by the visual feedback from T-Rex, users can also interactively refine the counting results by prompting on missing or falsely-detected objects. T-Rex has achieved state-of-the-art performance on several class-agnostic counting benchmarks. To further exploit its potential, we established a new counting benchmark encompassing diverse scenarios and challenges. Both quantitative and qualitative results show that T-Rex possesses exceptional zero-shot counting capabilities. We also present various practical application scenarios for T-Rex, illustrating its potential in the realm of visual prompting.
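The "detect, then count" formulation in the abstract can be illustrated with a small, generic sketch. This is not T-Rex's implementation: the function names, the score threshold, and the use of greedy non-maximum suppression are all assumptions made for illustration. It only shows that once a detector returns scored boxes for the prompted pattern, the count is the number of boxes that survive filtering.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_by_detection(
    boxes: List[Box],
    scores: List[float],
    score_thresh: float = 0.3,  # hypothetical value, not from the paper
    iou_thresh: float = 0.5,    # hypothetical value, not from the paper
) -> List[Box]:
    """Keep confident, non-duplicate boxes; the count is len(result)."""
    # Drop low-confidence detections, then visit the rest from high to low score.
    candidates = sorted(
        ((s, b) for b, s in zip(boxes, scores) if s >= score_thresh),
        key=lambda pair: pair[0],
        reverse=True,
    )
    kept: List[Box] = []
    for _, box in candidates:
        # Greedy suppression: skip a box that heavily overlaps one already kept.
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return kept


# Toy detections: two overlapping boxes on one object, one separate object,
# and one low-confidence box.
demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (50, 50, 60, 60)]
demo_scores = [0.9, 0.8, 0.7, 0.1]
demo_kept = count_by_detection(demo_boxes, demo_scores)
print(len(demo_kept))  # 2
```

Because the output is a set of boxes rather than a bare number, a user can see which objects were missed or falsely detected, which is what makes the interactive refinement described above possible.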

