PapMOT: Exploring Adversarial Patch Attack against Multiple Object Tracking
PapMOT addresses a security problem for computer vision systems that rely on MOT, such as surveillance and autonomous vehicles, by exposing and exploiting vulnerabilities in the physical world. It represents a new attack method rather than an incremental improvement.
The paper tackles the vulnerability of multiple object tracking (MOT) methods to adversarial attacks. It proposes PapMOT, which generates physical adversarial patches that degrade tracking performance in both digital and physical scenarios, and it reports successful attacks on various MOT trackers in its evaluations.
Tracking multiple objects in a continuous video stream is crucial for many computer vision tasks. It involves detecting objects and associating them with their respective identities across successive frames. Despite significant progress in multiple object tracking (MOT), recent studies have revealed that existing MOT methods are vulnerable to adversarial attacks. However, all of these attacks are digital attacks that inject pixel-level noise into input images, and are therefore ineffective in physical scenarios. To fill this gap, we propose PapMOT, which generates physical adversarial patches against MOT for both digital and physical scenarios. Besides attacking the detection mechanism, PapMOT optimizes a printable patch that is itself detected as new targets, misleading the identity association process. Moreover, we introduce a patch enhancement strategy that further degrades the temporal consistency of tracking results across video frames, resulting in more aggressive attacks. We also develop new evaluation metrics to assess the robustness of MOT against such attacks. Extensive evaluations on multiple datasets demonstrate that PapMOT can successfully attack MOT trackers of various architectures in digital scenarios. We further validate the effectiveness of PapMOT for physical attacks by deploying printed adversarial patches in the real world.
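To make the notion of degraded identity association concrete, the following minimal sketch counts identity switches, a quantity commonly used when evaluating MOT. It is not one of the new metrics proposed here, which the abstract does not define. The function name `count_id_switches`, the input format (one dictionary per frame mapping a ground-truth object id to the track id assigned by the tracker), and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, List


def count_id_switches(frames: List[Dict[int, int]]) -> int:
    """Count identity switches over a sequence of frames.

    Hypothetical input format: each frame maps a ground-truth object id to
    the track id the tracker assigned to it in that frame. Objects the
    tracker missed in a frame are simply absent from that frame's dict.
    """
    last_seen: Dict[int, int] = {}  # ground-truth id -> last assigned track id
    switches = 0
    for frame in frames:
        for gt_id, track_id in frame.items():
            # A switch occurs when an object reappears under a different track id.
            if gt_id in last_seen and last_seen[gt_id] != track_id:
                switches += 1
            last_seen[gt_id] = track_id
    return switches


# Toy sequences (made-up data): stable identities vs. disrupted identities.
clean = [{1: 10, 2: 20}, {1: 10, 2: 20}, {1: 10, 2: 20}]
attacked = [{1: 10, 2: 20}, {1: 11}, {1: 10, 2: 21}]

print(count_id_switches(clean), count_id_switches(attacked))  # prints: 0 3
```

In the toy `attacked` sequence, object 1 changes track id twice and object 2 returns under a new id after being missed for one frame, which is the kind of temporal inconsistency an attack on identity association aims to cause.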