CV · CR · Oct 26, 2021

Semantic Host-free Trojan Attack

arXiv:2110.13414v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses a security vulnerability in machine learning models, making Trojan attacks stealthier and more applicable to real-world scenarios; it is an incremental improvement over existing Trojan attacks.

The paper tackles the problem of making Trojan attacks more practical and harder to detect by using triggers fixed in semantic space rather than pixel space, resulting in an attack that generalizes to new patterns and bypasses state-of-the-art defenses with only a small number of training patterns.

In this paper, we propose a novel host-free Trojan attack with triggers that are fixed in the semantic space but not necessarily in the pixel space. In contrast to existing Trojan attacks, which use clean input images as hosts to carry small, meaningless trigger patterns, our attack treats triggers as full-sized images belonging to a semantically meaningful object class. Because the backdoored classifier is encouraged to memorize the abstract semantics of the trigger images rather than any specific fixed pattern, it can later be triggered by semantically similar but different-looking images. This makes our attack more practical to apply in the real world and harder to defend against. Extensive experimental results demonstrate that, with only a small number of Trojan patterns for training, our attack generalizes well to new patterns of the same Trojan class and can bypass state-of-the-art defense methods.
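
The contrast the abstract draws between host-based and host-free triggers can be illustrated with a toy data-construction sketch. This is only an assumption about how the poisoned training set might be assembled; the abstract does not give the paper's actual training procedure or loss. The names `stamp_patch`, `build_host_free_poisoned_set`, and `target_label` are hypothetical, and images are plain nested lists so the example has no dependencies.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]      # toy grayscale image: rows of pixel values
Sample = Tuple[Image, int]   # (image, class label)


def stamp_patch(host: Image, patch: Image) -> Image:
    """Host-based baseline: copy a small fixed patch onto the
    top-left corner of a clean host image (pixel-space trigger)."""
    stamped = [row[:] for row in host]          # do not mutate the host
    for r, patch_row in enumerate(patch):
        stamped[r][: len(patch_row)] = patch_row
    return stamped


def build_host_free_poisoned_set(
    clean: Sequence[Sample],
    trojan_images: Sequence[Image],
    target_label: int,
) -> List[Sample]:
    """Host-free variant (assumed): full-sized images of one semantic
    class are added as-is, each relabeled with the attacker's target
    label. No clean image is used as a carrier."""
    poisoned = list(clean)
    poisoned.extend((img, target_label) for img in trojan_images)
    return poisoned


if __name__ == "__main__":
    clean_set = [([[0, 0], [0, 0]], 0), ([[9, 9], [9, 9]], 1)]
    trojans = [[[5, 5], [5, 5]]]   # stand-in for a trigger-class image
    mixed = build_host_free_poisoned_set(clean_set, trojans, target_label=1)
    print(len(mixed), "training samples,", len(trojans), "of them Trojan")
```

In the host-based case the trigger is the same pixel patch on every poisoned image, whereas in the host-free sketch each Trojan sample is a different whole image from the trigger class, which is what would let a classifier associate the class semantics, not a fixed pattern, with the target label.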

