CV · Jun 20, 2025

DepthVanish: Optimizing Adversarial Interval Structures for Stereo-Depth-Invisible Patches

arXiv:2506.16690v2 · h-index: 6 · Has Code
Originality: Highly original
AI Analysis

This work addresses vulnerabilities in stereo depth estimation for autonomous driving and robotics, offering a practical tool for security assessment, though it is incremental as it builds on prior texture-based attacks.

The paper studies adversarial attacks on stereo depth estimation. It finds that introducing regular intervals between repeated textures improves patch performance, and it develops a method that optimizes both the structure and the texture of the patch. The resulting patches successfully attack RAFT-Stereo and STTR, as well as commercial Intel RealSense cameras in real-world conditions.

Stereo depth estimation is a critical task in autonomous driving and robotics, where inaccuracies (such as misidentifying nearby objects as distant) can lead to dangerous situations. Adversarial attacks against stereo depth estimation can help reveal vulnerabilities before deployment. Previous works have shown that repeating optimized textures can effectively mislead stereo depth estimation in digital settings. However, our research reveals that these naively repeated textures perform poorly in physical implementations, i.e., when deployed as patches, limiting their practical utility for stress-testing stereo depth estimation systems. In this work, for the first time, we discover that introducing regular intervals among the repeated textures, creating a grid structure, significantly enhances the patch's attack performance. Through extensive experimentation, we analyze how variations of this novel structure influence adversarial effectiveness. Based on these insights, we develop a novel stereo depth attack that jointly optimizes both the interval structure and the texture elements. Our generated adversarial patches can be inserted into any scene and successfully attack advanced stereo depth estimation methods of different paradigms, i.e., RAFT-Stereo and STTR. Most critically, our patches can also attack commercial RGB-D cameras (Intel RealSense) in real-world conditions, demonstrating their practical relevance for security assessment of stereo systems. The code is officially released at: https://github.com/WiWiN42/DepthVanish
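To make the grid structure described in the abstract concrete, the sketch below lays out a repeated texture with regular intervals between the copies. It only illustrates the layout and is not the authors' implementation: the function name `tile_with_intervals`, its parameters, and the constant fill used for the intervals are assumptions for illustration, and the joint optimization of interval structure and texture is not shown.

```python
import numpy as np


def tile_with_intervals(texture, rows, cols, gap, gap_value=0.0):
    """Repeat `texture` on a rows x cols grid with `gap` pixels between copies.

    Hypothetical helper for illustration only. `gap_value` (the fill used
    for the intervals) is an assumption; the abstract does not specify it.
    """
    th, tw, ch = texture.shape
    height = rows * th + (rows - 1) * gap
    width = cols * tw + (cols - 1) * gap

    # Start from a canvas filled with the interval value, then paste copies.
    patch = np.full((height, width, ch), gap_value, dtype=texture.dtype)
    for r in range(rows):
        for c in range(cols):
            y = r * (th + gap)
            x = c * (tw + gap)
            patch[y:y + th, x:x + tw] = texture
    return patch


# Random stand-in for an optimized texture element.
rng = np.random.default_rng(0)
texture = rng.random((8, 8, 3)).astype(np.float32)

patch = tile_with_intervals(texture, rows=3, cols=4, gap=2)
print(patch.shape)  # (28, 38, 3)
```

Setting `gap=0` reproduces the naive repetition that the abstract reports as performing poorly when deployed as a physical patch.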

