CV | LG | Nov 3, 2025

Extremal Contours: Gradient-driven contours for compact visual attribution

arXiv:2511.01411v1 | h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses the need for interpretable and robust visual attribution in computer vision, particularly for users who rely on model explanations. It is an incremental improvement over existing perturbation-based methods.

The paper tackles the problem of generating compact and faithful visual explanations for vision models. It introduces a training-free method that replaces dense masks with gradient-driven, smooth contours, matching the fidelity of dense masks while improving run-to-run consistency and reducing complexity. On self-supervised DINO models, it raises relevance mass by over 15%.

Faithful yet compact explanations for vision models remain a challenge, as commonly used dense perturbation masks are often fragmented and overfitted, needing careful post-processing. Here, we present a training-free explanation method that replaces dense masks with smooth tunable contours. A star-convex region is parameterized by a truncated Fourier series and optimized under an extremal preserve/delete objective using the classifier gradients. The approach guarantees a single, simply connected mask, cuts the number of free parameters by orders of magnitude, and yields stable boundary updates without cleanup. Restricting solutions to low-dimensional, smooth contours makes the method robust to adversarial masking artifacts. On ImageNet classifiers, it matches the extremal fidelity of dense masks while producing compact, interpretable regions with improved run-to-run consistency. Explicit area control also enables importance contour maps, yielding transparent fidelity-area profiles. Finally, we extend the approach to multiple contours and show how it can localize multiple objects within the same framework. Across benchmarks, the method achieves higher relevance mass and lower complexity than gradient- and perturbation-based baselines, with especially strong gains on self-supervised DINO models, where it improves relevance mass by over 15% and maintains positive faithfulness correlations.
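
To make the contour parameterization concrete, here is a minimal NumPy sketch of a star-convex soft mask whose boundary radius is a truncated Fourier series in the polar angle. It illustrates the general idea only and is not the authors' implementation: the function names, the sigmoid edge, the `softness` parameter, and the clamping of the radius at zero are all assumptions made for this example, and the preserve/delete optimization against a classifier is omitted.

```python
import numpy as np

def fourier_radius(theta, a0, a, b):
    """Boundary radius r(theta) = a0 + sum_k a[k-1]*cos(k*theta) + b[k-1]*sin(k*theta)."""
    k = np.arange(1, len(a) + 1)          # harmonic indices 1..K
    angles = theta[..., None] * k         # shape (..., K)
    return a0 + np.cos(angles) @ a + np.sin(angles) @ b

def contour_mask(height, width, center, a0, a, b, softness=1.0):
    """Soft mask in (0, 1): near 1 inside the contour, near 0 outside.

    `center` is (row, col). The sigmoid edge and `softness` are illustrative
    choices (assumptions), not taken from the paper.
    """
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    dy, dx = ys - center[0], xs - center[1]
    rho = np.hypot(dx, dy)                # pixel distance from the center
    theta = np.arctan2(dy, dx)            # pixel polar angle
    # Clamp at zero so the radius stays non-negative (assumed safeguard).
    radius = np.maximum(fourier_radius(theta, a0, a, b), 0.0)
    # A pixel is inside when rho < r(theta); the sigmoid smooths the edge.
    return 1.0 / (1.0 + np.exp(-(radius - rho) / softness))

# Example: K = 4 harmonics gives 2*4 + 1 = 9 shape parameters,
# versus 64 * 64 = 4096 free values for a dense mask of the same size.
a = np.array([0.0, 3.0, 0.0, 0.0])        # a cos(2*theta) term elongates the shape
b = np.zeros(4)
mask = contour_mask(64, 64, center=(32, 32), a0=12.0, a=a, b=b)
```

Because each ray from the center crosses the boundary at a single radius, the thresholded region is star-convex with respect to that center by construction, which is what makes a single connected mask possible in this kind of parameterization. In a gradient-driven setting the coefficients `a0`, `a`, `b` (and possibly the center) would be the optimized variables; an autodiff framework would be needed for that step.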

